Bulk-insert records into a PostgreSQL DB using Sails.js and Waterline ORM


I need to insert a large batch of records into a PostgreSQL database using the Sails.js framework and the Waterline ORM. I have files containing 100k records of users and their links:

// users.txt
line 1 {"firstname":"name1","lastname":"last1","email":"[email protected]","phone":"+3801","links":[1]}
line 2 {"firstname":"name2","lastname":"last2","email":"[email protected]","phone":"+3802","links":[2]}
line 3 {"firstname":"name3","lastname":"last3","email":"[email protected]","phone":"+3803","links":[3]}
...

// links.txt
line 1 {"url":"http://linkedin.com/1","socialNetwork":3,"user":1}
line 2 {"url":"http://facebook.com/2","socialNetwork":2,"user":2}
line 3 {"url":"http://twitter.com/3","socialNetwork":1,"user":3}
...

I have a User model and a Link model with a many-to-many association between them:

// User.js
module.exports = {
  tableName: 'users',
  attributes: {
    firstname: {
      type: 'string',
      required: true,
    },
    lastname: {
      type: 'string',
      required: true,
    },
    email: {
      type: 'string',
      required: true,
      unique: true,
    },
    phone: {
      type: 'string',
      unique: true,
    },
    links: {
      collection: 'link',
      via: 'user',
    },
  },
};

// Link.js

module.exports = {
  tableName: 'links',
  schema: false,
  attributes: {
    url: {
      type: 'string',
      required: true,
      unique: true,
    },
    socialNetwork: {
      model: 'socialnetwork',
    },
    user: {
      collection: 'user',
      via: 'links',
    },
  },
};

and an endpoint to generate the records and save them to the database. I've tried reading the data from the files synchronously and then inserting it into the database in chunks of 10k records, ten times:

// Uses `readFileSync` from Node's built-in fs module
// (`const { readFileSync } = require('fs');` at the top of the controller).
generateUsers: async (req, res) => {
    const links = readFileSync('./api/db/links.txt').toString().replace(/\r\n/g, '\n').split('\n');
    const users = readFileSync('./api/db/users.txt').toString().replace(/\r\n/g, '\n').split('\n');

    const mappedL = links.map((l) => JSON.parse(l));
    const mappedU = users.map((l) => JSON.parse(l));

    await sails.getDatastore().transaction(async (db) => {
      // Insert in chunks of at most 10k records per createEach call.
      const chunk = Math.min(links.length, 10000);

      for (let i = 0; i < users.length; i += chunk) {
        let chunkArray = mappedL.slice(i, i + chunk);
        await Link.createEach(chunkArray).usingConnection(db);

        chunkArray = mappedU.slice(i, i + chunk);
        await User.createEach(chunkArray).usingConnection(db);
      }
    });

    return res.ok(users.length + ' users were generated and inserted.');
},
But that takes a long time to insert, and to be honest I have no idea how to manage this (via Sails.js and the Waterline ORM). I've thought about doing it with streams (reading from the files and inserting into the database simultaneously), but I couldn't figure out how; I've sketched below what I have in mind. I would appreciate any help, thank you!
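Here is a rough, untested sketch of the streaming idea, using Node's built-in readline module; the batch size and the streamInsert helper are just placeholder names I made up:

// Rough sketch (untested): stream a file line by line and insert in batches,
// instead of loading the whole file into memory first.
const { createReadStream } = require('fs');
const readline = require('readline');

const BATCH_SIZE = 10000; // placeholder batch size

async function streamInsert(filePath, model, db) {
  const rl = readline.createInterface({
    input: createReadStream(filePath),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });

  let batch = [];
  for await (const line of rl) {
    if (!line.trim()) { continue; } // skip blank lines
    batch.push(JSON.parse(line));
    if (batch.length >= BATCH_SIZE) {
      await model.createEach(batch).usingConnection(db);
      batch = [];
    }
  }
  // Flush the remaining records.
  if (batch.length > 0) {
    await model.createEach(batch).usingConnection(db);
  }
}

// Inside the action, something like:
// await sails.getDatastore().transaction(async (db) => {
//   await streamInsert('./api/db/links.txt', Link, db);
//   await streamInsert('./api/db/users.txt', User, db);
// });

Would this actually be faster than the chunked readFileSync version, or is there a better way to do bulk inserts in Waterline?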
