I am trying out the new clickhouse lightweight deletes, which deletes rows on the next merge but asynchronously 'marks' them immediately so that they are not shown in subsequent queries.

The guide i am following is here: https://clickhouse.com/docs/en/sql-reference/statements/delete/

But this doesn't seem to be happening as expected. After deletion it takes about 2-3 minutes before my returned queries are excluded from the results. am I missing something here?

E.g.

I have a collapsingMergeTree table called 'tests' with the following three rows:

Id name
syGGJVGETbzKMkayoYYaAg kieren
wFhZdsdjf1xmcHGqK1CQQf mike
abfZrhYkiafg7qr9jAwseG peter

I attempt to run

DELETE FROM test WHERE Id = 'syGGJVGETbzKMkayoYYaAg' SETTINGS allow_experimental_lightweight_delete=1;

// OR

SET allow_experimental_lightweight_delete=1;
DELETE FROM test WHERE Id = 'syGGJVGETbzKMkayoYYaAg' 

However when I run the query I still get the deleted record in results, until about 5 minutes later.

I am using the clickhouse docker image (clickhouse/clickhouse-server) for my tests locally.

Thanks in advance for your help.

--- what did you try ---

Tried the documents code and expected that the results would be hidden immediately and deleted on the background as the page indicates so.

Please let me know if I am doing anything wrong or missing something

1

There are 1 best solutions below

0
Denny Crane On

DELETE FROM test is async by default. Not only process of deletion but DELETE itself as well.

you can use mutations_sync=2:

DELETE FROM test 
WHERE Id = 'syGGJVGETbzKMkayoYYaAg' 
SETTINGS allow_experimental_lightweight_delete=1, mutations_sync=2;

It makes your delete synchronous.

( this behaviour was changed today BTW https://github.com/ClickHouse/ClickHouse/pull/44718 )

So currently you run DELETE FROM test ... it returns control immediately and does nothing!!! except it creates a mutation. This mutation asynchronously in the background changes the invisible column _deleted for rows matching WHERE condition, this process can take seconds, minutes, hours, days. After that your selects will not see these rows. After that eventually OR NEVER the marked rows will be deleted during merges, this can take months.