I am trying to configure persistent HTTP caching using the org.apache.http.impl.client.cache.CachingHttpClients builder. However, when I configure a cache directory, the cache never seems to be read back from disk.
I tried to set up persistent caching using setCacheDir, i.e.,
    CachingHttpClients.custom()
        .setCacheDir(cacheDir)
        .setDeleteCache(false)
        .build();
(see below for a complete example)
The behaviour I'm seeing:
- For each request I see cache entries being written to cacheDir, of the form 1703170640727.0000000000000001-997b0365.User.-url-path. So far so good.
- Subsequent requests to the same URL get a cache hit. They are fast. Still good.
- I restart my application.
- Making a request to the same URL again results in a cache miss.
It seems that the cache entries that were written to disk are not picked up after a restart, and I haven't been able to find a way to make the client load them.
How do I initialize Apache's HTTP cache, so caching persists after restarts?
Minimal reproducible example. Running this multiple times results in a "Cache miss" every time, although there are cache entries being written to disk. I would expect reruns to use the cache that was written to disk. Note that I do see a cache hit if I perform two requests to the same URL within the same run.
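    import java.io.File;
    import java.nio.file.Path;
    import org.apache.http.client.cache.CacheResponseStatus;
    import org.apache.http.client.cache.HttpCacheContext;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.cache.CachingHttpClients;

    // Reuse the same directory on every run so entries from earlier runs are still on disk.
    File cacheDir = Path.of(System.getProperty("java.io.tmpdir")).resolve("my-http-cache").toFile();
    if (!cacheDir.exists() && !cacheDir.mkdirs()) {
        throw new RuntimeException("Could not create cache directory " + cacheDir + ".");
    }
    HttpCacheContext context = HttpCacheContext.create();
    // Both the client and the response are closed when the try block exits.
    try (var client = CachingHttpClients.custom()
            .setCacheDir(cacheDir)
            .setDeleteCache(false)
            .useSystemProperties()
            .build();
         CloseableHttpResponse response = client.execute(
            new HttpGet("https://api.github.com/repos/finos/common-domain-model"), context)) {
        // The context records how the caching module handled this request.
        CacheResponseStatus responseStatus = context.getCacheResponseStatus();
        switch (responseStatus) {
            case CACHE_HIT:
                System.out.println("Cache hit!");
                break;
            case CACHE_MODULE_RESPONSE:
                System.out.println("The response was generated directly by the caching module");
                break;
            case CACHE_MISS:
                System.out.println("Cache miss!");
                break;
            case VALIDATED:
                System.out.println("Cache hit after validation");
                break;
        }
    }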
Apache's HTTP caching keeps track of a cache entry for each eligible HTTP response. Each cache entry points to an abstract "resource" object, which holds the cached response. With CachingHttpClients.custom().setCacheDir(cacheDir), this resource is a file, i.e., responses are saved to disk rather than kept in memory, which reduces memory usage. However, the cache entries themselves are still kept in memory, so they do not survive a restart.
The following implementation could be used to persist cache entries as well:
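To make the mechanism concrete, here is a minimal JDK-only sketch of the idea. It is not HttpClient's API and not a drop-in fix. The class name DiskBackedCache, the persistIndex flag, and the index.ser file name are all hypothetical, chosen for illustration only. Bodies are always written to files, while the key-to-file index lives in memory unless it is explicitly serialized. In HttpClient 4.x, the equivalent hook would presumably be a custom HttpCacheStorage passed to setHttpCacheStorage on the builder, which is not shown here.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

/** Hypothetical illustration: bodies live on disk, the index is optionally persisted. */
final class DiskBackedCache {
    private final Path dir;
    private final Path indexFile;
    private final boolean persistIndex;
    // Maps a cache key (here: the URL) to the name of the file holding the body.
    private final HashMap<String, String> index;

    DiskBackedCache(Path dir, boolean persistIndex) throws IOException {
        this.dir = Files.createDirectories(dir);
        this.indexFile = this.dir.resolve("index.ser");
        this.persistIndex = persistIndex;
        // Without a persisted index, a fresh instance starts empty even if
        // body files from a previous run are still present in the directory.
        this.index = (persistIndex && Files.exists(indexFile)) ? loadIndex() : new HashMap<>();
    }

    void put(String url, byte[] body) throws IOException {
        // Simplistic file naming; hash collisions are ignored in this sketch.
        String fileName = Integer.toHexString(url.hashCode()) + ".body";
        Files.write(dir.resolve(fileName), body);
        index.put(url, fileName);
        if (persistIndex) {
            saveIndex();
        }
    }

    /** Returns the cached body, or null on a cache miss. */
    byte[] get(String url) throws IOException {
        String fileName = index.get(url);
        return fileName == null ? null : Files.readAllBytes(dir.resolve(fileName));
    }

    @SuppressWarnings("unchecked")
    private HashMap<String, String> loadIndex() throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(indexFile))) {
            return (HashMap<String, String>) in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException("Could not read cache index", e);
        }
    }

    private void saveIndex() throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(indexFile))) {
            out.writeObject(index);
        }
    }
}
```

With persistIndex set to false, a second instance created over the same directory misses on every key even though the body files are still there, which mirrors the behaviour described in the question. With persistIndex set to true, the second instance reloads the index and finds the entry again.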
Usage: