Unnesting a JSON column that has multiple nested values in python

88 Views Asked by At

I'm trying to unnest a json data from a cURL query. I'm able to unnest upto level 1. The challenge is that a specific column has multiple levels and the next level of the column name starts with an underscore.

import requests as r
import json
import pandas as pd

data = r.get(url,auth=('uname','pwd'), verify=False)
response_dict = data.json()


df = pd.json_normalize(response_dict)
df1=df.explode(['hits.hits'])
df1

level 1 print

enter image description here

when i add this code and print

df2 = df1.explode(['hits.hits'])
df2

enter image description here

However the columns in hits.hits which has multiple columns which start with an underscore like _index, _type which again has multiple level. I'm unable to unnest them. any idea how i could get this done. Trying to explode using

df2.explode['hits.hits._source']

Gives me key error hits.hits._source

Regards, Renato.

1

There are 1 best solutions below

2
Shatha Al-Bajly On
import requests
import pandas as pd

# Make your API request
data = requests.get(url, auth=('uname', 'pwd'), verify=False)
response_dict = data.json()

# Normalize the JSON data
df = pd.json_normalize(response_dict)

# Explode the 'hits.hits' column
df1 = df.explode('hits.hits')

# Explode the nested columns within 'hits.hits._source'
df2 = df1.explode('hits.hits._source')

# If there are additional nested columns within 'hits.hits._source', you can continue exploding them
# For example, if there's a column 'hits.hits._source.some_column', you can explode it like this:
# df3 = df2.explode('hits.hits._source.some_column')

# Now df2 contains the unnested data
print(df2)