I have a dataset in R that is in this format:

| Name | Company | submitted_data |
|---|---|---|
| Bob | Bob Inc. | {"dob":2002-03-04, "tel": 1234} |
| Fadela | Fadela Co. | NULL |
| Andy | Andrew Inc. | {"dob": 1999-10-10, "industry": retail} |

I wish to extract the data in the "submitted_data" column into separate columns, using the respective values and preserving NULLs. For example, the above should look something like:

| Name | Company | dob | tel | industry |
|---|---|---|---|---|
| Bob | Bob Inc. | 2002-03-04 | 1234 | null |
| Fadela | Fadela Co. | NULL | NULL | NULL |
| Andy | Andrew Inc. | 1999-10-10 | null | retail |

I know I need to use the jsonlite package, but so far it has only thrown errors and I have not been able to get anywhere. Thank you.

Essentially, your problem is that your column is not valid JSON: values such as `2002-03-04` and `retail` would need to be quoted.
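
As a quick check, here is a small sketch using `jsonlite::validate()` on the first row's string, copied from the question. The variable names are only illustrative:

```r
library(jsonlite)

# First row's value, copied from the question
raw <- '{"dob":2002-03-04, "tel": 1234}'

# validate() reports whether a string is valid JSON; the unquoted date is rejected
is_valid_raw <- validate(raw)

# The same string with the date quoted is accepted and can be parsed
quoted <- '{"dob":"2002-03-04", "tel": 1234}'
is_valid_quoted <- validate(quoted)
parsed <- fromJSON(quoted)
```

Here `is_valid_raw` should be `FALSE` and `is_valid_quoted` should be `TRUE`, which is why the values have to be quoted before `fromJSON()` can be used.
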
We can use the pattern `gsub(":(.*?)([,}])", ':"\\1"\\2', txt)` to replace all instances of `:` followed by a value with that value in quotes (e.g. replacing `"dob" : 2002-03-04` with `"dob" : "2002-03-04"`).

I've used `dplyr::bind_rows()` here as it's an easy way to bind a list of JSON objects which do not have the same keys for each row.

Input data:
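
A minimal sketch of the whole approach, with the input rebuilt from the table in the question. It assumes the NULL entry is stored as `NA`. It also uses a slight variation on the pattern (`\\s*` plus `perl = TRUE`) so that the captured values carry no surrounding spaces. `row_id` is a helper column introduced only for illustration:

```r
library(jsonlite)
library(dplyr)

# Input data rebuilt from the question's table; NULL is represented as NA (assumption)
df <- data.frame(
  Name = c("Bob", "Fadela", "Andy"),
  Company = c("Bob Inc.", "Fadela Co.", "Andrew Inc."),
  submitted_data = c(
    '{"dob":2002-03-04, "tel": 1234}',
    NA,
    '{"dob": 1999-10-10, "industry": retail}'
  ),
  stringsAsFactors = FALSE
)

# Wrap every value in quotes; \\s* trims whitespace around the value.
# NA entries are passed through unchanged by gsub().
fixed <- gsub(":\\s*(.*?)\\s*([,}])", ':"\\1"\\2', df$submitted_data, perl = TRUE)

# Parse each row into a one-row data frame. The row_id column guarantees
# that rows with missing JSON still contribute a row.
rows <- lapply(seq_along(fixed), function(i) {
  fields <- if (is.na(fixed[i])) list() else fromJSON(fixed[i])
  as.data.frame(c(list(row_id = i), fields), stringsAsFactors = FALSE)
})

# bind_rows() fills keys that are absent from a row with NA
extracted <- bind_rows(rows)

# Attach the extracted columns to the original identifiers, dropping the helper
result <- cbind(
  df[c("Name", "Company")],
  extracted[setdiff(names(extracted), "row_id")]
)
print(result)
```

Because every value is quoted, all extracted columns come back as character, so `dob` and `tel` would need converting (for example with `as.Date()` or `as.integer()`) afterwards. The pattern is also only suitable for flat objects like these; values that themselves contain `:`, `,` or `}` would need a more careful approach.
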