MongoDB cluster - importing billions of records

I am trying to import 10 billion records. I started by testing with 1 billion records, and the import time gets worse as more records are inserted. Here are the configuration and stats.

MongoDB version: 3.4
Documents: 1,226,592,923
Routers (m4.xlarge): 2
Config servers: 3

Nodes (i3.large, 15 GB RAM, NVMe SSD)    Import time (hrs)
5                                        14:30:00
10                                       8:10:00

Each document has around 7 fields, and the shard key is a compound key on 3 of them. I followed all of the recommendations at https://docs.mongodb.com/v3.4/reference/ulimit/#recommended-ulimit-settings.
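
For reference, the collection is sharded roughly like this (the database, collection, and field names here are placeholders, not my real schema):

    sh.enableSharding("mydb")
    // compound shard key on 3 fields (hypothetical names)
    sh.shardCollection("mydb.events", { a: 1, b: 1, c: 1 })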

Import options

--writeConcern '{ w: 0, j: false }'
--numInsertionWorkers 8
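
For context, each import job is invoked roughly like this (the host, database, collection, and file names are placeholders):

    mongoimport --host mongos1.example.com:27017 \
        --db mydb --collection events --file part-0001.json \
        --writeConcern '{ w: 0, j: false }' \
        --numInsertionWorkers 8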

I even tried disabling journaling (--nojournal), but it made no real difference.

I am not sure if this is the expected import time, or whether there is anything else I can do to improve the ingestion rate.

1 Answer

Here are some of the factors that made a big improvement in import performance (a combined sketch follows the list):

  1. Pre-splitting the collection
  2. Sorting the data by shard key
  3. Disabling the balancer (sh.stopBalancer())
  4. Turning off auto-splitting during the load (sh.disableAutoSplit(), or restarting the mongos with --noAutoSplit)
  5. Building indexes after the complete data load
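
As a rough illustration, the load can be wrapped like this in the mongo shell (the namespace, field names, split points, and index are placeholders; real split points should come from the distribution of your own shard-key values):

    // Steps 3 and 4: stop chunk migrations and auto-splitting during the load
    sh.stopBalancer()
    sh.disableAutoSplit()

    // Step 1: pre-split into empty chunks at known shard-key boundaries
    // (assumes a hypothetical compound shard key { a: 1, b: 1, c: 1 })
    for (var i = 1; i < 100; i++) {
        sh.splitAt("mydb.events", { a: i * 1000, b: MinKey, c: MinKey })
    }

    // ... run the (sorted) mongoimport jobs here ...

    // Step 5: build secondary indexes only after the data is loaded
    db.events.createIndex({ timestamp: 1 })

    // re-enable balancing and auto-splitting once the import is done
    sh.startBalancer()
    sh.enableAutoSplit()

Note that with the balancer stopped, the pre-split chunks stay on their current shards unless you also distribute them with sh.moveChunk(), which is what the pre-splitting reference below walks through.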

References:

  1. https://blog.zawodny.com/2011/03/06/mongodb-pre-splitting-for-faster-data-loading-and-importing/
  2. https://stackoverflow.com/a/19672303