Unioned table costs more to query in BigQuery than individual tables?

148 Views Asked by At

BigQuery cost scenarios

1

When I query a large unioned table - partitioned by date field and clustered by a clientkey field - for a specific client's data it appears to process more data than if I just queried that client table individually. Same query, should be the exact same data from different tables; massively different cost.

Does anyone know why it costs more to query a partitioned/clustered unioned table compared to the same data from the individual client-specific table?

I'm trying to make the case for still keeping this data unioned and partitioned+clustered as opposed to individual datasets! Thanks!

1

There are 1 best solutions below

2
Yun Zhang On

There is factor which may affect your scenario, however, the factor is not a contract so this answer may be irrelevant over time.

The assumptions are:

  1. the partitioned table is clustered
  2. the individual table is also clustered
  3. the query utilized clustering and touched only small amount of data (comparing with the cluster block size)

In such case, the cluster block size might affect the cost. Since the individual table is much smaller than the partitioned table, the individual table tends to have smaller cluster block size. The query is eventually charged by the combined size of blocks getting scanned.