Incremental Checkpoint Data Size (Flink)


We are using Flink and have enabled incremental checkpointing by setting `state.backend.incremental=true`, with RocksDB as the state backend. With incremental checkpointing, we expect "Checkpointed Data Size" to be smaller than "Full Checkpoint Data Size" in the Flink UI, but they are the same, as shown in the screenshot.
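For reference, this is roughly the `flink-conf.yaml` we use for this setup (the checkpoint directory path below is a placeholder, not our actual path):

```yaml
state.backend: rocksdb
state.backend.incremental: true
# placeholder path for the checkpoint storage location
state.checkpoints.dir: s3://my-bucket/checkpoints
```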

Please let me know if anyone has an idea about this.

"Checkpointed Data Size" should be smaller than "Full Checkpoint Data Size"


1 Answer


With the RocksDB state backend, incremental checkpoints are implemented by taking advantage of how RocksDB works internally. What this means is that an incremental checkpoint is taken by only copying (to the distributed file system where the checkpoints are stored) new SST files that were created since the previous checkpoint.
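As a rough sketch (not Flink's actual implementation), the file-selection logic behind an incremental checkpoint can be illustrated as follows; the function name and SST file names are hypothetical:

```python
def incremental_checkpoint(current_sst_files, previously_uploaded):
    """Return the SST files to upload for this checkpoint: only those
    not already present in an earlier checkpoint on the remote store."""
    new_files = [f for f in current_sst_files if f not in previously_uploaded]
    # (in a real system, new_files would now be copied to the
    # distributed file system where checkpoints are stored)
    uploaded = previously_uploaded | set(new_files)
    return new_files, uploaded

# First checkpoint: no prior state, so every SST file is new.
to_upload, seen = incremental_checkpoint(["000001.sst", "000002.sst"], set())

# Second checkpoint: 000002.sst was already uploaded, so only the
# newly created 000003.sst needs to be copied.
to_upload2, seen2 = incremental_checkpoint(["000002.sst", "000003.sst"], seen)
```

This is why, with enough state, incremental checkpoints transfer far less data than full ones: unchanged SST files are simply referenced again rather than re-uploaded.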

Another aspect of this is that Flink checkpoints small chunks of state (by default, any chunk smaller than 20 KB) in the root metadata file for the checkpoint, rather than in SST files. This behavior is controlled by state.storage.fs.memory-threshold.
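For illustration, this threshold is set in `flink-conf.yaml`; the value shown below is just the documented default:

```yaml
# State chunks smaller than this threshold are stored inline in the
# checkpoint metadata file instead of in separate state files.
state.storage.fs.memory-threshold: 20kb
```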

You have very little state, so it's all going into the metadata file. So in your case there's no practical difference between full checkpointing and incremental checkpointing.

If you were to reduce state.storage.fs.memory-threshold to 0, then you would see a change -- but probably for the worse. Then the incremental checkpoints might be larger than the full ones, because then Flink would be copying over the uncompacted SST files. If your state is changing from one checkpoint to another, the incremental checkpoints would include both the old and new state, up until RocksDB does a compaction. Incremental checkpoints are only helpful when you have a large amount of state.

For a deeper explanation, see Improvements for large state and recovery in Flink, a Flink Forward talk by Stefan Richter, who did the implementation.