I'm using GraphDB version 10.2.0. I am encountering a performance issue with GraphDB that I hope you can help me with. Here's the situation:
Problem Description:
When I first run a specific query on Wikidata after restarting the GraphDB Docker container, it takes approximately 5 hours to complete. When I rerun the same query, it completes in around 9000 milliseconds, which is what I expect it to take.
My Question: Why does the query take 5 hours on the first run and only 9000 milliseconds on the second run? Can anyone shed some light on why this might be happening and how I can resolve this issue?
I've taken several steps to diagnose and address this issue:
Docker Container Restart: First, I restart the GraphDB Docker container to clear the cache.
Query Execution: After the restart, I execute a specific query against my Wikidata dataset.
Expected Outcome:
I expected that the query would execute efficiently, taking approximately 9000 milliseconds, as it does on subsequent runs after the Docker container restart.
Actual Result:
On the first run after a Docker container restart, the query takes around 5 hours to complete. Subsequent runs of the same query consistently execute in the expected time frame, around 9000 milliseconds.
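For anyone who wants to reproduce the comparison, here is a minimal sketch of how the first and second run could be timed against the repository's SPARQL endpoint. It assumes GraphDB is reachable on localhost:7200 (the port mapped in my setup) and that the repository is called `wikidata`, which is a hypothetical name; `build_request` and `timed_run` are illustrative helpers, not part of any GraphDB tooling.

```python
import time
import urllib.parse
import urllib.request

GRAPHDB_URL = "http://localhost:7200"  # assumption: host port 7200 is mapped to the container
REPOSITORY_ID = "wikidata"             # assumption: hypothetical repository name


def build_request(query: str) -> urllib.request.Request:
    """Build a form-encoded SPARQL query POST for the repository endpoint."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{GRAPHDB_URL}/repositories/{REPOSITORY_ID}",
        data=data,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def timed_run(query: str, opener=urllib.request.urlopen):
    """Send the query once; return (elapsed seconds, response size in bytes)."""
    start = time.perf_counter()
    with opener(build_request(query)) as response:
        body = response.read()
    return time.perf_counter() - start, len(body)
```

Calling `timed_run(query)` twice in a row right after a container restart, with the same query string, gives the first-run and second-run timings side by side.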
I start the container from the command line, but the following docker-compose.yml is the equivalent configuration for my setup.
services:
  graphdb:
    image: khaller/graphdb-free
    ports:
      - "7200:7200"
    volumes:
      - <volume_contains_data>:/root/graphdb-import
      - <data_volume>:/opt/graphdb/data
    environment:
      GRAPHDB_WORKBENCH_IMPORTDIRECTORY: <volume_contains_data>
      GRAPHDB_HEAP_SIZE: 128g
      JAVA_TOOL_OPTIONS: -Xmx64g