How would you use AWS Data Pipeline, Elastic MapReduce, and Redshift to perform ETL and data warehousing?

Asked by Abi Manoharan on 16 October 2022 at 13:27

I'm very new to data warehousing and AWS.

For school, we have to make a presentation on how data warehousing can be performed using the following three technologies:

Redshift
AWS Data Pipeline
Elastic MapReduce

This is my understanding thus far:

Redshift is the data warehouse platform where you would store your data to perform analysis and business intelligence activities.
AWS Data Pipeline can be used to schedule tasks and operations. Somehow, it can also be used for data transformation.
Elastic MapReduce can also be used for data transformation.

I just don't understand how you would use these things together to perform data warehousing activities. Would you use Data Pipeline to schedule ETL processes in Elastic MapReduce and then transfer the data to Redshift? If so, how can you do that?

There is 1 best solution below

pratik On 18 October 2022 at 12:49

I have explained the data flow here in a diagram. EMR jobs are what you use when you need to derive insights from a large volume of data.

You could run a SQL query on Redshift in this case too, but assume the job involves some complex operation that can't be solved with a SQL query.
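
To make the flow concrete, here is a minimal sketch of a pipeline definition that schedules an EMR transformation and then copies the result into Redshift. It only builds the definition as a Python dictionary and prints it as JSON; it does not call AWS. The object types (`Schedule`, `S3DataNode`, `EmrCluster`, `EmrActivity`, `RedshiftDatabase`, `RedshiftDataNode`, `RedshiftCopyActivity`, `Ec2Resource`) are Data Pipeline object types, but the field names are written from memory of the definition format and should be checked against the Data Pipeline object reference. Every bucket, script path, cluster ID, table name, instance type, and credential below is a hypothetical placeholder.

```python
import json

# Hypothetical placeholders: replace with your own bucket and script.
RAW_PATH = "s3://example-bucket/raw/"
STAGED_PATH = "s3://example-bucket/staged/"
# EMR step written as "jar,arg1,arg2,..."; transform.py is an assumed Spark script.
TRANSFORM_STEP = ",".join([
    "command-runner.jar", "spark-submit",
    "s3://example-bucket/scripts/transform.py", RAW_PATH, STAGED_PATH,
])


def ref(object_id):
    """Point one pipeline object at another by its id."""
    return {"ref": object_id}


def build_pipeline_definition():
    """Describe the flow: raw S3 data -> EMR transform -> staged S3 data -> Redshift."""
    schedule = ref("DailySchedule")
    objects = [
        # Data Pipeline's role: run everything below once a day.
        {"id": "DailySchedule", "type": "Schedule",
         "period": "1 day", "startAt": "FIRST_ACTIVATION_DATE_TIME"},
        # Where the raw input lives and where the transformed output goes.
        {"id": "RawData", "type": "S3DataNode",
         "directoryPath": RAW_PATH, "schedule": schedule},
        {"id": "StagedData", "type": "S3DataNode",
         "directoryPath": STAGED_PATH, "schedule": schedule},
        # EMR's role: a transient cluster that runs the heavy transformation.
        {"id": "EtlCluster", "type": "EmrCluster", "releaseLabel": "emr-5.36.0",
         "masterInstanceType": "m5.xlarge", "coreInstanceType": "m5.xlarge",
         "coreInstanceCount": "2", "schedule": schedule},
        {"id": "TransformOnEmr", "type": "EmrActivity",
         "runsOn": ref("EtlCluster"), "step": TRANSFORM_STEP,
         "input": ref("RawData"), "output": ref("StagedData"),
         "schedule": schedule},
        # Redshift's role: the warehouse table the staged data is loaded into.
        {"id": "Warehouse", "type": "RedshiftDatabase",
         "clusterId": "example-cluster", "databaseName": "analytics",
         "username": "etl_user", "*password": "CHANGE_ME"},
        {"id": "SalesTable", "type": "RedshiftDataNode",
         "database": ref("Warehouse"), "tableName": "sales",
         "schedule": schedule},
        # Small EC2 instance that issues the copy into Redshift.
        {"id": "CopyHost", "type": "Ec2Resource",
         "instanceType": "m5.large", "schedule": schedule},
        {"id": "LoadIntoRedshift", "type": "RedshiftCopyActivity",
         "runsOn": ref("CopyHost"), "insertMode": "TRUNCATE",
         "input": ref("StagedData"), "output": ref("SalesTable"),
         # Explicit ordering: load only after the EMR step has finished.
         "dependsOn": ref("TransformOnEmr"), "schedule": schedule},
    ]
    return {"objects": objects}


if __name__ == "__main__":
    print(json.dumps(build_pipeline_definition(), indent=2))
```

Read top to bottom, the definition shows how the three services divide the work: Data Pipeline owns the schedule and the ordering, EMR does the transformation that is too complex for SQL, and Redshift receives the cleaned data for analysis. If the transformation can be expressed in SQL, the EMR objects could be dropped and the raw data copied straight into Redshift.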
