As the question says, I am running the dbt snapshot command and a few of my snapshots are not working because DBT is not recognizing the surrogate key id that I created. My snapshots are all built the same way and so are the base views that they are based off of. Here is an example of a snapshot that is not working because it does not recognize the surrogate key:

{% snapshot example_snapshot %}
    {{ config(
        target_schema = 'snapshots',
        unique_key = 'id',
        strategy = 'check',
        check_cols = 'all'
    ) }}

    SELECT
        *
    FROM
        {{ ref('base_example') }}
{% endsnapshot %}

followed by an example of the base view it is referencing:

WITH src AS (
    SELECT
        *
    FROM
        {{ source(
            'tomato',
            'potato'
        ) }}
),
cleaned AS (
    SELECT
        *,
        {{ dbt_utils.surrogate_key(['column', 'another_column', 'yet_another_column']) }} AS id
    FROM
        src
)
SELECT 
    *
FROM 
    cleaned

Keep in mind that when I run the command dbt run -m [base_example] it produces a view where I can see the hash generated as a surrogate key. The issue is only when I run dbt snapshot. In fact, running dbt snapshot --select [example_snapshot] to only run one snapshot at a time doesn't give me any errors for any of the snapshots. The most confusing part: I have one base view and snapshot of that base view configured exactly as the other 3 that are not working, yet it recognizes the surrogate key when creating a snapshot. I'm seriously stumped, any help would be appreciated.

1

There are 1 best solutions below

0
Nick S. On

In my experience, snapshots can get a little wonky when dependent on a dbt model (via ref('base_example')) and not a source. It's advisable, though without great explanation in the docs as to why, to select from the source.

Since your transformation is only about adding a surrogate key based on three columns in the source, I wonder if you could just stick the transformation in the unique_key parameter, á la (here thinking in Redshift land and without testing):

{% snapshot example_snapshot %}
    {{ config(
        target_schema = 'snapshots',
        unique_key = 'md5(column, another_column, yet_another_column)',
        strategy = 'check',
        check_cols = 'all'
    ) }}

    SELECT
        *
    FROM
        {{ source('tomato', 'potato') }}
{% endsnapshot %}