I have migrated some Oracle and MySQL databases to DolphinDB with DataX and the MySQL plugin, and I want to check whether the migrated data is consistent with the source.
I’ve tried the following 2 methods:
1. Compare the number of records between the source and target databases. The problem is that count in MySQL is very inefficient and takes more than a day. Besides, this method only validates the volume of records, not the data itself.
2. Retrieve the data in batches and compare the values. With such a high volume of data, this is also too slow.
Is there any efficient way to check data consistency between the source and target databases?
My solution verified the data consistency of a migration where the largest table contains 900 million records with more than 30 fields.
To validate the data consistency, a primary key is required: the data is sorted by the primary key and an MD5 value is calculated for each record.
The per-record MD5 values are then combined into a final MD5 value for the source table and for the target table, and the two values are compared.
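As a rough illustration, here is a minimal Python sketch for the MySQL side, assuming pymysql and placeholder table/column names ("orders", "id"). The same fold (sort by primary key, MD5 each record, feed the digests into one running MD5) has to be applied to the DolphinDB table, e.g. through its Python API, and the two final hex digests are compared. Both sides must format field values identically, otherwise the digests will differ even when the data matches.

```python
import hashlib
import pymysql

# Sketch of the per-record MD5 fold on the MySQL side.
# Table and column names ("orders", "id") are placeholders.
def mysql_table_md5(host, user, password, db, table="orders", pk="id", batch=50_000):
    conn = pymysql.connect(host=host, user=user, password=password, database=db)
    final = hashlib.md5()
    try:
        # SSCursor streams rows instead of loading the whole result set into memory.
        with conn.cursor(pymysql.cursors.SSCursor) as cur:
            cur.execute(f"SELECT * FROM {table} ORDER BY {pk}")
            while True:
                rows = cur.fetchmany(batch)
                if not rows:
                    break
                for row in rows:
                    # MD5 of one record: its fields joined with a separator.
                    record = "|".join("" if v is None else str(v) for v in row)
                    final.update(hashlib.md5(record.encode("utf-8")).digest())
    finally:
        conn.close()
    return final.hexdigest()
```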
Tip: The calculation can be done in SQL. To improve performance, you can sort the data by the primary key, split it into groups, and check the groups concurrently using Python.
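For example, the primary-key space can be cut into fixed-size ranges and each range checked in its own worker. In the sketch below, group_md5 is a hypothetical helper that computes the folded MD5 of one key range on one side (e.g. by adding WHERE pk >= lo AND pk < hi to the query above):

```python
from concurrent.futures import ThreadPoolExecutor

# group_md5(side, lo, hi) is a hypothetical helper returning the folded MD5
# of all records with lo <= pk < hi on "mysql" or "dolphindb".
def find_mismatched_ranges(ranges, group_md5, workers=8):
    def check(key_range):
        lo, hi = key_range
        return lo, hi, group_md5("mysql", lo, hi) == group_md5("dolphindb", lo, hi)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(check, ranges)
    return [(lo, hi) for lo, hi, ok in results if not ok]

# Example: 900 million keys split into ranges of 10 million records each.
ranges = [(lo, lo + 10_000_000) for lo in range(0, 900_000_000, 10_000_000)]
# mismatches = find_mismatched_ranges(ranges, group_md5)
```

Only the ranges that come back as mismatched need to be re-fetched and compared record by record, which keeps the expensive batch comparison limited to a small part of the data.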