I would like to flatten a json request from an API. I have been trying to use pandas to achieve this.
Here is a sample of the json I am working with:
response =
{
"data": {
"getTestExecutions": {
"total": 50,
"start": 1,
"limit": 100,
"results": [
{
"issueId": "12345",
"jira": {
"key": "test-1234",
"summary": "test- JSON",
"project": {
"self": "some value",
"id": "4321",
"key": "tst",
"name": "test",
"projectTypeKey": "software",
"simplified": false,
"avatarUrls": {
"key" : "value",
"key1" : "value1",
"key2" : "value1",
"key4" : "value4"
}
}
},
"tests": {
"total": 25,
"start": 0,
"limit": 100,
"results": [
{
"issueId": "1234",
"testType": {
"name": "Manual"
},
"jira": {
"key": "example-123",
"summary": "sample example"
}
},
{
"issueId": "12345",
"testType": {
"name": "Manual"
},
"jira": {
"key": "example-124",
"summary": "sample example 1"
}
}
]
},
"testRuns": {
"results": [
{
"status": {
"name": "PASSED"
},
"executedById": "user id",
"startedOn": "date here",
"finishedOn": "date here",
"assigneeId": "assigneeId"
},
{
"status": {
"name": "PASSED"
},
"executedById": "executedBy id",
"startedOn": "start date",
"finishedOn": "end date",
"assigneeId": "assigneeId"
}
]
}
}
]
}
}
}
this is my code so far:
df = pd.json_normalize(d['data']['getTestExecutions']['results'])
df_test_results = df.explode('tests.results')
df_test_results_Norm = pd.json_normalize(df_test_results['tests.results']).add_prefix('test.')
df_testRun_results = df.explode('testRuns.results')
df_testRun_results_Norm = pd.json_normalize(df_testRun_results['testRuns.results']).add_prefix('testRuns.')
df = pd.concat([df.drop(['tests.results','testRuns.results'],axis=1),df_test_results_Norm, df_testRun_results_Norm],axis=1)
The problems I run into is I want every column to be the keys of the dictionaries (without losing any) while maintaining order.I want a single dataframe with the parent records repeating for each of the child records. What I mean by this is as follows:
- Test Executions is the parent and tests/testRuns are the children
- For every test execution I could have 100 tests, and 100 testRuns. I want the parent record in the dataframe to apply to each child record
Right now my code produces a dataframe with N/A values for the blanks because, I am concatenating the results of different sizes.
Anyone know the best approach to do this? Thanks for the help.
Code:
Parse parent and children separately then merge all 3 with reduce.
Output: