How do I migrate data from one table to another in Amazon Keyspaces?

162 Views Asked by At

What is the best practice for migrate data from a table (large data, around 100G), with a query to obtain a subset of data, to another table?

I tried with Spark and DSBulk but both have pre-requisite to use a csv file, and that's not my scenario.

2

There are 2 best solutions below

0
Erick Ramirez On

Cassandra's CQL does not have the equivalent of SQL's INSERT INTO table SELECT ... command that copies data from one table into another as a single statement because this requires a full table scan that doesn't scale for large datasets and/or large clusters.

You will need to export the contents of the source table, do the relevant transformations to the data, then import it to the destination table.

The simplest way to do this is to export the data to a CSV file using DSBulk, manually manipulate the CSV file so it only contains the data you want, then bulk-load the updated CSV into the new table using DSBulk.

Since you don't want to deal with CSV files, the only option you have is to write a Spark or Java application to read the data one-by-one, transform it then write the transformed data to the new table. Cheers!

0
meet-bhagdev On

You can use the built-in point in time restore capability to copy one Amazon Keyspaces table into another.

  1. Turn on the point in time restore feature
  2. Create a new table using the most recent snapshot (upto a second's granularity)

See this link with more details - https://docs.aws.amazon.com/keyspaces/latest/devguide/PointInTimeRecovery.html