How to push or pull data from AWS to Google BigQuery


We have an application hosted on AWS with a relational database (Aurora PostgreSQL). Our customer wants the data replicated on a daily basis to their Google BigQuery instance. The amount of data is larger than 1 TB, so a full load every day is not an option and we have to use some kind of change data capture (CDC). For security reasons I don't want to expose our database on the public internet, and for other reasons a site-to-site VPN connection has been ruled out. So we thought about using AWS Database Migration Service (DMS) with CDC and exporting the data increments to an S3 bucket.
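A minimal sketch of what the DMS side of this could look like: the task parameters for a CDC-only replication from Aurora PostgreSQL to an S3 target. All ARNs, names, and the schema filter below are placeholders I made up for illustration, not values from the question; with boto3, this dict would be passed to `create_replication_task`.

```python
import json

# Table-mapping rules: include every table in the "public" schema
# (placeholder schema name for illustration).
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-public-schema",
            "object-locator": {"schema-name": "public", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

# Parameters for a CDC-only DMS task; all ARNs are placeholders.
replication_task_params = {
    "ReplicationTaskIdentifier": "aurora-to-s3-cdc",        # placeholder name
    "SourceEndpointArn": "arn:aws:dms:REGION:ACCT:endpoint/src",   # placeholder
    "TargetEndpointArn": "arn:aws:dms:REGION:ACCT:endpoint/s3",    # placeholder
    "ReplicationInstanceArn": "arn:aws:dms:REGION:ACCT:rep/inst",  # placeholder
    "MigrationType": "cdc",  # change data capture only, no daily full load
    "TableMappings": json.dumps(table_mappings),
}

# With boto3 this would be invoked roughly as:
#   boto3.client("dms").create_replication_task(**replication_task_params)
print(replication_task_params["MigrationType"])
```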

The question now is: how does the customer get this data into their Google BigQuery from there, or is there another way that we may have overlooked?

Björn

A VPN connection is ruled out, so using Qlik Replicate is not possible.

1 Answer

Answered by shamiso:

One approach could be to transfer the daily data into Google Cloud Storage, using the date as part of the object naming convention. You can then set up a trigger (using Google Cloud Functions) on the Cloud Storage bucket to automatically load the data into a BigQuery dataset. Within that dataset, you can append the data into a single table partitioned by date, or use date-sharded tables.