How do you time travel across groups of tables in delta/iceberg?

46 Views Asked by At

I am new to these table formats that allow time travel (Dela/iceberg/lance), but it's unclear to me how to manage dependencies across tables.

Suppose you have tables A, B, and C, where C builds on B which builds on A.

A -> B -> C

What are some good patterns to check what version of A generated C?

I could look at the timestamps of the versions, but you quickly end on checking that the timestamps be increasing in complicated ways.

I looked to see what is the minimum version of A that generates a monotonic increasing version timestamp up to C

1

There are 1 best solutions below

4
Atif On

Iceberg does provide time-travel by basically using the snapshots of the table as long as that snapshot exist in the system and data files related to these snapshots exist. There are basically two ways to do time travel.

  1. FOR VERSION AS OF syntax where you can pass the snapshot_id. Snapshot ID is stored in snaphots metadata table
SELECT * FROM prod.db.table VERSION AS OF 10963874102873;
  1. FOR TIMESTAMP AS OF syntax where you can pass the timestamp at which how the data was.

    SELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00';