I have a use case where I store logs in a ClickHouse database, partitioned by day. Once logs are written they are never updated. I want to monitor the integrity of the data files to verify that they have not been modified. A simple solution would be to calculate an MD5 hash of all files in a partition. But ClickHouse performs background merges based on its own heuristics, and a merge rewrites the part files, which would invalidate the hashes.

So the question is: if I stop inserting rows into a partition and run OPTIMIZE TABLE .. FINAL on it, is it guaranteed that no merge will ever happen for that partition again? If so, I think I can use this MD5 hash approach.
The intention of this use case is to guard against a hypothetical "hacker" who may intentionally modify the files.
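For reference, here is a rough sketch of what I have in mind, using the clickhouse-driver Python package. The table name, partition value, and data directory are placeholders for my setup, not anything prescribed by ClickHouse:

```python
import hashlib
from pathlib import Path

from clickhouse_driver import Client  # assumes the clickhouse-driver package

# Placeholders -- adjust to the actual table, partition and data path.
TABLE = "logs"
PARTITION = "2024-01-15"
DATA_DIR = Path("/var/lib/clickhouse/data/default/logs")

client = Client("localhost")

# Force a final merge, hoping no further background merges touch this partition.
client.execute(f"OPTIMIZE TABLE {TABLE} PARTITION '{PARTITION}' FINAL")

# Find the parts that currently belong to this partition and are active.
parts = client.execute(
    "SELECT name FROM system.parts "
    "WHERE table = %(table)s AND partition = %(partition)s AND active",
    {"table": TABLE, "partition": PARTITION},
)

# Hash every file of every active part; the manifest would be stored elsewhere
# (read-only) and re-checked periodically.
manifest = {}
for (part_name,) in parts:
    for f in sorted((DATA_DIR / part_name).rglob("*")):
        if f.is_file():
            manifest[str(f)] = hashlib.md5(f.read_bytes()).hexdigest()

for path, digest in manifest.items():
    print(digest, path)
```

This only makes sense if the set of parts (and their files) is stable after the OPTIMIZE, which is exactly what I'm asking about.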