I would like to ask for help. I ran into a problem: when I import JSON into MongoDB via Compass, it throws a duplicate _id error. So I switched to the terminal and used mongoimport, which runs successfully and reports that every document was imported without error, but I can see that some documents are missing. Can you give me some advice on how to solve this problem?
This is the command I run in the Windows cmd terminal:
mongoimport D:\DimplomaThesis_data\transfer_json\180000-190000.json -d diplomovka -c transfer --jsonArray --stopOnError --maintainInsertionOrder --upsertFields _id
This is the structure of a record in the JSON array:
{
  "_id": "5d6566d086dc8b72382bc376",
  "name": "Peter",
  "surname": "Zubrík",
  "titles": {
    "before": "",
    "after": ""
  },
  "sex": "M",
  "citizenship": "SVK",
  "birthyear": 1991,
  "age": 31,
  "transfer": {
    "source_ppo": "tj-polana-siba.futbalnet.sk",
    "org_profile_id": "sportovnik-klub-fc-mukarov.futbalnet.sk",
    "org_id": "5d5d3974eccb8850917918cd",
    "sector": {
      "_id": "sport:futbal:futbal",
      "category": "sport",
      "itemId": "futbal",
      "sectorId": "futbal"
    },
    "competence_type": "player",
    "transfer_type": "transfer",
    "issfMoveType": "PWP",
    "date_from": "2014-05-09T00:00:00.000Z",
    "date_to": null,
    "_id": "62e6d12c0ae29819010f611f",
    "org_profile_name": "Sportovník klub FC Mukařov",
    "org_name": "Sportovník klub FC Mukařov",
    "source_ppo_name": "TJ Poľana Šiba"
  },
  "issfId": "1208658"
}
"_id":"5d6566d086dc8b72382bc376" this could have multiple records in array same. I download data from APIs, around 30 JSON each contain 10.000 records. Ideally import all document to mongodb and next create pipeline in compass.
I found a solution to my problem.
I used Python to create a compound _id (a new primary key, i.e. a unique identifier for each record in the JSON array).
This code worked for me:
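Roughly, it does the following (a minimal sketch: the output file name is just an example, and combining the top-level _id with the nested transfer._id is my choice of a value that should be unique per record):

import json

INPUT = r"D:\DimplomaThesis_data\transfer_json\180000-190000.json"
OUTPUT = r"D:\DimplomaThesis_data\transfer_json\180000-190000_modified.json"  # example name

with open(INPUT, encoding="utf-8") as f:
    records = json.load(f)  # the file is one JSON array of documents

for record in records:
    # Keep the original person id in a separate field and build a compound _id
    # from the person _id and the nested transfer _id, so every document is unique.
    record["person_id"] = record["_id"]
    record["_id"] = record["_id"] + "_" + record["transfer"]["_id"]

with open(OUTPUT, "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False)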
Basically, I created a new, modified JSON file and imported it through MongoDB Compass, where the import finished with 0 errors (no more duplicate _id errors).