I'm new to Kafka and trying to understand its behavior.
I have a Kafka cluster with three brokers. I have given the cluster 2 GB of disk storage, and usage reached 95%, so I deleted the main topic that I used for testing. (This topic had a replication factor of 3, min in-sync replicas of 2, 8 partitions, and a retention time of 3 days.) I deleted it mainly because I had always produced all my test data to this topic, and my intention was to free up disk storage. (I assumed that deleting the topic would remove all of its persisted messages and give me back disk space on the cluster.) After deleting it, I noticed two things:
- Disk usage on one of the brokers went down, but usage on the other two brokers did not change at all.
- When I listed the topics in the cluster, the deleted topics had a note in front of them saying "Marked for deletion".
What is the reason for these behaviors?
By the way, I have set delete.topic.enable = true, and automatic topic creation is also enabled in the Kafka broker properties.
I think deleting a topic will not by itself clear the disk space. Once the topic is deleted, you can manually delete the index files and data files of its partitions, but this is not recommended.
I think the better solution is to set the cleanup policy to `delete` and reduce the retention time for the topic. Kafka will then delete segments older than that retention time, which keeps the brokers free of stale data.

If you need to keep the data on disk, change the cleanup policy to `compact` instead. Kafka will then compact the topic's old segments, keeping the latest message for each key in a partition and discarding older messages with the same key.
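
To make the difference between the two cleanup policies concrete, here is a toy in-memory model in Python. It is only a sketch under simplifying assumptions: real Kafka applies retention to whole log segments rather than to individual records and runs compaction in the background, and the `Record` type and both function names are made up for illustration (they are not part of any Kafka API).

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """Hypothetical stand-in for one message in a single partition."""
    offset: int
    key: str
    value: Optional[str]
    timestamp_ms: int


def apply_delete_policy(log, now_ms, retention_ms):
    """Model of the 'delete' policy: drop records older than the retention window."""
    return [r for r in log if now_ms - r.timestamp_ms <= retention_ms]


def apply_compact_policy(log):
    """Model of the 'compact' policy: keep only the highest-offset record per key."""
    latest = {}
    for r in log:
        latest[r.key] = r  # a later offset overwrites the earlier record for the same key
    return sorted(latest.values(), key=lambda r: r.offset)


log = [
    Record(0, "user-1", "a", 1_000),
    Record(1, "user-2", "b", 2_000),
    Record(2, "user-1", "c", 9_000),
    Record(3, "user-3", "d", 9_500),
]

after_delete = apply_delete_policy(log, now_ms=10_000, retention_ms=5_000)
after_compact = apply_compact_policy(log)

print([r.offset for r in after_delete])   # offsets that survive time-based retention
print([r.offset for r in after_compact])  # offsets that survive compaction
```

In this model, the delete policy frees space purely by age, regardless of key, while compaction keeps one record per key no matter how old it is. That is why lowering the retention time is the option that actually reclaims disk for test data.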