BigQuery - How to configure write disposition for a insert_rows job?

1.3k Views Asked by At

I'm creating a flow to insert rows in a BigQuery table from a list of dictionaries with the following format:

[
{'column1': 'value11', 'column2': 'value21', 'column3': [{'subfield1':'value131'}]},
{'column1': 'value12', 'column2': 'value22', 'column3': [{'subfield1':'value231'}]},
{'column1': 'value12', 'column2': 'value22', 'column3': [{'subfield1':'value331'}]}
]

I'm testing yet and wish to write truncate the table, but I don't know how to set the job_config for an insert_rows method. I set up a LoadJobConfig object as follows, but the insert_rows method doesn't accept job_config as a parameter.

job_config = bigquery.LoadJobConfig(
    schema=schema,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE
)

I'm not loading the from a DataFrame or a file because my table has nested fields and I didn't find out how to load nested data in those formats.

Can anyone help me?

2

There are 2 best solutions below

2
Mazlum Tosun On

You can't pass JobConfig with WRITE_TRUNCATE mode. You have to execute a separated query to truncate the table, then append all your dicts to BigQuery with insert_rows_json :

your_dicts = [
   {'column1': 'value11', 'column2': 'value21', 'column3': [{'subfield1':'value131'}]},
   {'column1': 'value12', 'column2': 'value22', 'column3': [{'subfield1':'value231'}]},
   {'column1': 'value12', 'column2': 'value22', 'column3': [{'subfield1':'value331'}]}
]

# Truncate the table
bq_client = bigquery.Client()
query_job = bq_client.query(f"TRUNCATE table {table_id}")

results = query_job.result()

# Append your list of Dict to BQ
job = bq_client.insert_rows_json(table_id, your_dicts)

The insert_rows_json uses the following insertAll api.

1
Stergios On

i don't understand mazlum's answer, here it says you can use write_truncate https://cloud.google.com/bigquery/docs/reference/rest/v2/Job