Snowflake external table partition include the field name and value

41 Views Asked by At

In the S3 storage, the partition is stored in the format "partition_mon_yr=202203." Upon ingestion by Glue, Glue interprets "partition_mon_yr" as the field name and "202203" as its corresponding value. This discrepancy arises because Snowflake external files construct the partition field from the metadata field value, resulting in a combined representation like "partition_mon_yr=202203."

Is it a good idea to extract the substring and store the "202203" value in the partition_mon_yr column in Snowflake? so that if the same data is used in Glue and Snowflake both will display in the same way. But my question arises, If I store whether partitioning against S3 will work or snowflake partitioning on an external table will not work with S3.

In Snowflake 
Actual partition value

CREATE EXTERNAL TABLE et1(
  partition_mon_yr string AS SPLIT_PART(metadata$filename, '/', 3)
  col1 varchar AS (value:col1::varchar)  
  col2 varchar AS (value:col2::varchar))
  PARTITION BY (partition_mon_yr)
  LOCATION=@s1/logs/
  AUTO_REFRESH = true
  FILE_FORMAT = (TYPE = PARQUET);

partition_mon_yr contains the value partition_mon_yr=202203

Extract the YYYYMO value:

CREATE EXTERNAL TABLE et1(
  partition_mon_yr string AS SPLIT_PART(SPLIT_PART(metadata$filename, '/', 3),'=',2),
  col1 varchar AS (value:col1::varchar)  
  col2 varchar AS (value:col2::varchar))
  PARTITION BY (partition_mon_yr)
  LOCATION=@s1/logs/
  AUTO_REFRESH = true
  FILE_FORMAT = (TYPE = PARQUET);

partition_mon_yr contains the value 202203.
0

There are 0 best solutions below