I am inserting JSON strings into Redis 7.2.4 from a Beam batch job running on Flink, using the Beam Java SDK and the Redisson client, version 3.27.1. Redis is deployed in Sentinel mode. Every night, a batch job loads approximately 80 GB of data (size as measured in Redis) into Redis. We ran into memory issues and adopted a "presharding" scheme to take advantage of Redis's memory optimization for small hashes, as described here.
A key could be item1 and the associated field 23. The value of that field would be "{\"c\":\"IN\",\"pr\":1,\"l\":\"en\",\"adf\":0}". Values get replaced. Keys and fields, once inserted, are not deleted, except for a tiny fraction.
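For context, here is a minimal sketch of how the presharded writes look with Redisson. The bucketing function, bucket count, master name and sentinel address below are placeholder assumptions, not our real code; with 1000 buckets, item id 1023 would land on key item1, field 23, as in the example above.

```java
import org.redisson.Redisson;
import org.redisson.api.RMap;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class PreshardedWriter {

    private final RedissonClient redisson;
    private final int buckets; // number of fields per hash, chosen during presharding (assumption)

    public PreshardedWriter(RedissonClient redisson, int buckets) {
        this.redisson = redisson;
        this.buckets = buckets;
    }

    // Derive the hash key ("item1") and field ("23") from the original item id.
    // The exact bucketing scheme here is illustrative only.
    public void put(long itemId, String json) {
        String hashKey = "item" + (itemId / buckets);
        String field = String.valueOf(itemId % buckets);
        RMap<String, String> map = redisson.getMap(hashKey);
        map.fastPut(field, json); // HSET: overwrites the field if it already exists
    }

    public static void main(String[] args) {
        Config config = new Config();
        config.useSentinelServers()
              .setMasterName("mymaster")                      // placeholder
              .addSentinelAddress("redis://sentinel1:26379"); // placeholder
        RedissonClient redisson = Redisson.create(config);

        PreshardedWriter writer = new PreshardedWriter(redisson, 1000);
        writer.put(1023L, "{\"c\":\"IN\",\"pr\":1,\"l\":\"en\",\"adf\":0}");

        redisson.shutdown();
    }
}
```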
After a lot of tweaking, we managed to load an entire batch of data into Redis. The issue we now face is that we cannot load a whole new batch: while inserting it, memory usage grows until it reaches the eviction limit, and eviction starts.
Firstly, we would like not to evict data at all. I know that Redis may not be the ideal choice for this requirement, but I have to make it work. We could temporarily tolerate evictions, since they would only very marginally degrade the quality of some of our services, but another issue has arisen. Once we cross the eviction limit and eviction starts, 1) writes become very slow (less than a tenth of the initial speed) and 2) nodes start to fail, potentially endangering the services that read data from Redis.
What I do not understand is that 99.9% of the data is identical from one batch to the next, so most operations simply replace preexisting hash fields. I know that deleting a key does not necessarily give memory back. Is the same true when replacing hash fields?
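To make the question concrete, here is roughly how I would compare the reported size of one presharded hash before and after replacing a field. This is just a sketch assuming Redisson's RObject.sizeInMemory() (which wraps MEMORY USAGE); the connection settings are placeholders.

```java
import org.redisson.Redisson;
import org.redisson.api.RMap;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class HashReplacementMemoryCheck {

    public static void main(String[] args) {
        Config config = new Config();
        config.useSentinelServers()
              .setMasterName("mymaster")                      // placeholder
              .addSentinelAddress("redis://sentinel1:26379"); // placeholder
        RedissonClient redisson = Redisson.create(config);

        // One of the presharded hashes, e.g. key "item1" with field "23".
        RMap<String, String> map = redisson.getMap("item1");

        // sizeInMemory() issues MEMORY USAGE for the key, i.e. the bytes
        // attributed to this hash object; it does not reflect allocator
        // fragmentation at the instance level.
        long before = map.sizeInMemory();

        // Replace the field with a value of the same shape and size.
        map.fastPut("23", "{\"c\":\"FR\",\"pr\":2,\"l\":\"fr\",\"adf\":1}");

        long after = map.sizeInMemory();
        System.out.printf("before=%d bytes, after=%d bytes%n", before, after);

        redisson.shutdown();
    }
}
```

Is this kind of per-key measurement even meaningful here, or is the growth I observe happening at the allocator level rather than in the hashes themselves?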
I have searched the web for how Redis handles hash field replacement from a memory point of view, but unfortunately found no clear answer.
Thanks in advance.