Airflow GCSToBigQueryOperator request too large - Error 413 Request Entity Too Large


I have a DAG running on Airflow / Cloud Composer that executes each night. During the day, several different systems copy .csv files into a GCS bucket.

When my DAG is triggered, it looks in the GCS bucket, identifies the files added during the day using a PythonOperator, and pushes the file paths to XCom as a list. A downstream GCSToBigQueryOperator then pulls that list and loads the data into a BigQuery table.

Sometimes the number of files is very high, which makes the operator's load request very large (30+ MB), and BigQuery responds with Error 413 (Request Entity Too Large).
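One possible direction (a sketch, not from the post): instead of handing the entire day's file list to a single GCSToBigQueryOperator, the list pulled from XCom could be split into fixed-size batches, with one load per batch, so no single request grows past the limit. All names below (`file_paths`, `MAX_FILES_PER_LOAD`, the path pattern) are hypothetical.

```python
# Hypothetical sketch: split a large list of GCS object paths into
# fixed-size batches, so each GCSToBigQueryOperator (or load job)
# receives a request well under BigQuery's request-size limit.

MAX_FILES_PER_LOAD = 500  # illustrative; tune to keep each request small


def chunk_file_list(file_paths, chunk_size=MAX_FILES_PER_LOAD):
    """Yield successive fixed-size batches of GCS object paths."""
    for i in range(0, len(file_paths), chunk_size):
        yield file_paths[i:i + chunk_size]


# Example: 1250 discovered files -> 3 batches of 500, 500, and 250
paths = [f"incoming/file_{n}.csv" for n in range(1250)]
batches = list(chunk_file_list(paths))
print(len(batches))      # 3
print(len(batches[-1]))  # 250
```

Each batch could then be passed as `source_objects` to its own operator instance (for example via Airflow's dynamic task mapping), though the batching logic itself is the part shown here.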

Do you have any suggestions on a workaround for this?
