How to use dynamic sql when defining a set of tables in GCP Dataform

229 Views Asked by At

I am trying to teach myself how to use Dataform to make my bigquery processes more scalable. I currently utilize stored procedures, but would like to transfer these processes to Dataform.

I'm stuck on Dataform protocol to reference a dataset variable when creating tables or using dynamic sql when creating a procedure that creates dynamic set of tables. Here is an example of what I'm referring to (partial snapshot of a stored procedure I use)

SET
  schemas = ARRAY(
  SELECT
    schema_name
  FROM
    my_project.INFORMATION_SCHEMA.SCHEMATA);
SET
  arrSize = ARRAY_LENGTH(schemas);
SET
  query = CONCAT("CREATE OR REPLACE TABLE `my_project.subset_of_datasets` as (select distinct(table_schema) from (select * from (");
WHILE
  i < arrSize -1 DO
SET
  query = CONCAT(query, "(select table_name, table_schema from my_project.",schemas[
  OFFSET
    (i)],".INFORMATION_SCHEMA.TABLES where contains_substr(table_name, 'p_AccountDeviceStat') and contains_substr(ddl, 'dpb_revenue') and contains_substr(ddl, 'dpb_transactions')) UNION ALL ");
SET
  i = i + 1;
END WHILE
  ;
SET
  query = CONCAT(query, "(select table_name, table_schema from my_project.",schemas[ORDINAL(arrSize)],".INFORMATION_SCHEMA.TABLES where contains_substr(table_name, 'p_AccountDeviceStat') and contains_substr(ddl, 'dpb_revenue') and contains_substr(ddl, 'dpb_transactions')))))");
EXECUTE IMMEDIATE
  query;

Thanks in advance for any assistance or advice!

I've tried reading all the existing dataform documentation but haven't been able to identify anything relevant

0

There are 0 best solutions below