Snowflake update statements lock the entire table and queue other update statements


We have a use case where we need to execute multiple UPDATE statements (each updating a different set of rows) on the same table in Snowflake without any latency caused by the update queries queueing. Currently, a single UPDATE statement takes about 1 min to execute, and all the other concurrent UPDATE statements (about 40 of them) are locked and queued, so the total time including the wait time is around 1 hour. The expected time is around 2 mins (assuming all the UPDATE statements execute at the same time and the warehouse size supports 40 concurrent queries without any queueing). What is the best solution to avoid this lock time? We've considered the following two options:

  1. Make changes in the application code to batch all the update statements and execute them as one query - not possible for our use case.
  2. Have a separate table for each client (each update statement updates rows for a different client) - this way, all the update statements will execute on separate tables and there won't be any locks.

Is the second approach the best way to go, or is there any other workaround that would help us reduce the queueing latency?


There are 2 answers below

Dinesh Kumar

This is expected behaviour, since Snowflake locks the table during an UPDATE.

Option 1 is the ideal way to scale in the data model, but since you can't make it happen, you can go with option 2.

You can also put all the updates into one staging table and do the upsert in bulk - a DELETE and INSERT instead of an UPDATE. Check whether you can afford the delay.
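For example, something like the following sketch, where updates_staging, clients, client_id, row_id and payload are placeholder names (not taken from the question):

-- Collect every pending change in one staging table, then apply it in bulk:
-- one DELETE of the affected rows followed by one INSERT of the new versions,
-- inside a single transaction so readers never see a gap.
BEGIN;

DELETE FROM clients
USING updates_staging s
WHERE clients.client_id = s.client_id
  AND clients.row_id    = s.row_id;

-- assumes the staging table has the same column layout as the target
INSERT INTO clients
SELECT * FROM updates_staging;

COMMIT;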

But if you ask me, Snowflake should not be used for atomic updates. It has to be an upsert (delete and insert, and that too in bulk); atomic updates will have limitations. Try going for a row-based store like Citus or MySQL if your use case allows it.

simbo1905

Snowflake takes partition locks during most of the work (computing what it is going to do) and a table lock for the commit:

https://community.snowflake.com/s/article/Understanding-Lock-Granularity-in-Snowflake

First, make your statements as fast as possible so that the locks are held for as short a time as possible. For example, we define cluster keys with the business date first. We do this because we load data for today/yesterday into large tables holding many years of data. If we join/filter with the date rather than without it, the number of partitions touched drops from > 1000 down to < 10.
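As a rough illustration of that clustering (target, business_date and client_id are assumed names, not from this answer):

-- Cluster the big table by business date first, so work that only touches
-- today/yesterday prunes down to a handful of micro-partitions.
ALTER TABLE target CLUSTER BY (business_date, client_id);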

If you are doing an UPDATE or a MERGE, to speed things up you have to use the cluster keys on every join and every filter. This ensures you read/write the minimum number of partitions while holding partition locks.
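For example, an UPDATE along these lines (a sketch with assumed names; the point is that the leading cluster key appears in the join/filter):

-- The business_date predicate lets Snowflake prune partitions, so the statement
-- reads, writes and locks far fewer micro-partitions.
UPDATE target
SET    payload = source.payload
FROM   buffer source
WHERE  target.business_date = source.business_date
  AND  target.client_id     = source.client_id;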

Always run the queries manually in the UI, click on the query id, then look at the query statistics to see how many partitions are used. To get things as fast as possible, our full process is as follows (a rough sketch of the three steps is given after the list):

  1. COPY from the stage into a buffer table that has the same cluster keys as the target table. This is not a temp table, in case we have crashes; it is just a regular table, identical to the target table with the same cluster keys, that we delete from at the end of the job. If we crash before the job completes, the pending work is left in the buffer table and is processed on the next job run.
  2. MERGE INTO target USING ( SELECT ... FROM buffer ) source, where the SELECT dedups what is in the buffer so that we are not trying to merge identical rows, as that would fail. We do that because a crash/restart of our data load jobs may very rarely result in two files in the stage holding identical data. We dedup on metadata$filename, which is the name of the file in the stage.
  3. DELETE FROM buffer WHERE ... so that we are ready for the next run. As we run the data load jobs frequently during the day, the buffer table is never very large, so the dedup logic of the SELECT in the MERGE is never slow.
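A hedged sketch of those three steps, with assumed table and column names (target, buffer, business_date, client_id, payload, load_file, @my_stage):

-- 1. COPY from the stage into the persistent buffer table, capturing
--    metadata$filename so duplicate files can be deduped later.
--    (The file format is assumed to be defined on the stage.)
COPY INTO buffer (business_date, client_id, payload, load_file)
  FROM (SELECT $1, $2, $3, metadata$filename FROM @my_stage);

-- 2. MERGE a deduplicated view of the buffer into the target, with the
--    cluster keys in the ON clause. The comparisons are simplified here;
--    the null-safe version is shown below.
MERGE INTO target
USING (
    SELECT business_date, client_id, payload
    FROM   buffer
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY business_date, client_id  -- one row per key
        ORDER BY load_file                     -- pick one file when files repeat
    ) = 1
) source
ON  target.business_date = source.business_date
AND target.client_id     = source.client_id
WHEN MATCHED THEN UPDATE SET target.payload = source.payload
WHEN NOT MATCHED THEN INSERT (business_date, client_id, payload)
    VALUES (source.business_date, source.client_id, source.payload);

-- 3. Clear the processed rows so the next run starts from an empty buffer
--    (in practice a WHERE clause restricts this to the rows handled by this job).
DELETE FROM buffer;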

We had to be null-safe on the merge into the target table:

MERGE ... 
ON 1=1 
  AND ( ( source.x is null and target.x is null ) OR (source.x = target.x) )
  AND ( ( source.y is null and target.y is null ) OR (source.y = target.y) )

We found it best to over-specify the cluster keys on joins and filters and then check in the UI that the number of partitions used is as small as possible, which keeps the statements as fast as possible.