Problem
What are the next steps to complete JanusGraph indices?
I've already tried following line-by-line of the JanusGraph docs;
And tried using closer to source commits janusGraph instead of janusGraphManagement;
And tried suggestions provided so far; which none have worked yet.
[o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
Trials
Try 1
Tried to recreate what I initially made, and a part of the suggestion to revisit JanusGraph docs.
The issue with this one is that logs said I didn't have indexes still.
This is what initially got me checking if I was needing janusGraph.commit() instead.
Logs
...
2023-05-15 09:58:05,984 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-15T14:58:05.984686Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@48b4a043
2023-05-15 09:58:06,047 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-15 09:58:06,270 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:58:06,275 [INFO] [Main.main] :: drop g.V().hasLabel("entity").count().next(): 0
2023-05-15 09:58:07,020 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:58:07,033 [INFO] [Main.main] :: addVertex g.V().hasLabel("entity").count().next(): 1
2023-05-15 09:58:07,046 [INFO] [o.j.g.d.m.GraphIndexStatusWatcher.main] :: Some key(s) on index _id do not currently have status(es) [REGISTERED]: _id=ENABLED
...
[o.j.g.d.m.GraphIndexStatusWatcher.main] :: Some key(s) on index _id do not currently have status(es) [REGISTERED]: _id=ENABLED
2023-05-15 09:59:07,520 [INFO] [o.j.g.d.m.GraphIndexStatusWatcher.main] :: Timed out (PT1M) while waiting for index _id to converge on status(es) [REGISTERED]
2023-05-15 09:59:07,522 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:59:07,531 [INFO] [Main.main] :: awaitGraphIndexStatus g.V().hasLabel("entity").count().next(): 1
2023-05-15 09:59:07,627 [INFO] [o.j.g.o.j.IndexRepairJob.Thread-68] :: Index _id metrics: success-tx: 2 doc-updates: 0 succeeded: 1
...
2023-05-15 09:59:08,297 [INFO] [o.j.g.d.m.ManagementSystem.Thread-51] :: Index update job successful for [_id]
2023-05-15 09:59:08,299 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:59:08,304 [INFO] [Main.main] :: updateIndex g.V().hasLabel("entity").count().next(): 1
Process finished with exit code 0
Code
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphVertex;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.graphdb.database.management.GraphIndexStatusReport;
import org.janusgraph.graphdb.database.management.ManagementSystem;
import java.util.concurrent.ExecutionException;
public class Test {
private static final Logger logger = LogManager.getLogger(Main.class);
public static void main(String[] args) throws InterruptedException, ExecutionException {
JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
GraphTraversalSource g = janusGraph.traversal();
g.V().drop().iterate();
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphVertex vertex = janusGraph.addVertex("entity");
vertex.property("_id", "Test1");
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
if (!janusGraphManagement.containsGraphIndex("_id"))
janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
janusGraphManagement.commit();
janusGraphManagement = janusGraph.openManagement();
GraphIndexStatusReport report = ManagementSystem.awaitGraphIndexStatus(janusGraph, "_id").call();
logger.info("awaitGraphIndexStatus g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphIndex test = janusGraphManagement.getGraphIndex("_id");
janusGraphManagement.updateIndex(test, SchemaAction.REINDEX).get();
janusGraphManagement.commit();
logger.info("updateIndex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
janusGraph.close();
}
}
Try 2
JanusGraph is telling me I cannot enable index because its installed.
Which JanusGraph should be telling me its registered instead.
Which shouldn't even matter if I'm just doing a basic awaitGraphIndexStatus() like the example.
Which wouldn't matter if I knew how to create the index before the vertices existed.
Logs
2023-05-12 12:50:04,641 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-12 12:50:04,737 [INFO] [c.d.o.d.i.c.DefaultMavenCoordinates.main] :: DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.15.0
2023-05-12 12:50:05,236 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-12 12:50:05,518 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=4fe054a)=null; please provide the correct local DC, or check your contact points
2023-05-12 12:50:05,763 [INFO] [o.j.g.i.UniqueInstanceIdRetriever.main] :: Generated unique-instance-id=c0a8563c21084-rmt-lap-win201
2023-05-12 12:50:05,786 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-12 12:50:05,823 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-12 12:50:05,861 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=5ac9b054)=null; please provide the correct local DC, or check your contact points
2023-05-12 12:50:05,880 [INFO] [o.j.d.c.ExecutorServiceBuilder.main] :: Initiated fixed thread pool of size 40
2023-05-12 12:50:05,998 [INFO] [o.j.g.d.StandardJanusGraph.main] :: Gremlin script evaluation is disabled
2023-05-12 12:50:06,026 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-12T17:50:06.025476Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@48b4a043
2023-05-12 12:50:06,086 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:06,344 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:06,353 [INFO] [Main.main] :: drop g.V().count().next(): 0
2023-05-12 12:50:07,107 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:07,110 [INFO] [Main.main] :: addVertex g.V().count().next(): 1
2023-05-12 12:50:08,197 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:08,202 [INFO] [Main.main] :: buildIndex g.V().count().next(): 1
2023-05-12 12:50:08,209 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:08,212 [INFO] [Main.main] :: updateIndex g.V().count().next(): 1
Exception in thread "main" java.lang.IllegalArgumentException: Update action [ENABLE_INDEX] cannot be invoked for index with status [INSTALLED]
at org.janusgraph.core.schema.SchemaAction.isApplicableStatus(SchemaAction.java:85)
at org.janusgraph.graphdb.database.management.ManagementSystem.updateIndex(ManagementSystem.java:864)
at org.janusgraph.graphdb.database.management.ManagementSystem.updateIndex(ManagementSystem.java:845)
at Test12.main(Test12.java:39)
Code
public class Test {
private static final Logger logger = LogManager.getLogger(Main.class);
public static void main(String[] args) throws InterruptedException, ExecutionException {
JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
GraphTraversalSource g = janusGraph.traversal();
g.V().drop().iterate();
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphVertex vertex = janusGraph.addVertex("entity");
vertex.property("_id", "Test1");
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("buildIndex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
janusGraphManagement.updateIndex(janusGraphManagement.getGraphIndex("_id"), SchemaAction.REGISTER_INDEX).get();
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("REGISTER_INDEX g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
janusGraphManagement.updateIndex(janusGraphManagement.getGraphIndex("_id"), SchemaAction.ENABLE_INDEX).get();
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("ENABLE_INDEX g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
ManagementSystem.awaitGraphIndexStatus(janusGraph, "_id").call();
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("awaitGraphIndexStatus g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
janusGraph.close();
}
}
Try 3
Suggestion to revisit JanusGraph docs, I no longer think the output was a misnomer.
Cleaned up to what it is now.
Logs
2023-05-15 11:02:15,142 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-15 11:02:15,230 [INFO] [c.d.o.d.i.c.DefaultMavenCoordinates.main] :: DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.15.0
2023-05-15 11:02:15,853 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-15 11:02:16,152 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=705b0e14)=null; please provide the correct local DC, or check your contact points
2023-05-15 11:02:16,410 [INFO] [o.j.g.i.UniqueInstanceIdRetriever.main] :: Generated unique-instance-id=c0a856493416-rmt-lap-win201
2023-05-15 11:02:16,433 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-15 11:02:16,472 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-15 11:02:16,522 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=1b473c95)=null; please provide the correct local DC, or check your contact points
2023-05-15 11:02:16,548 [INFO] [o.j.d.c.ExecutorServiceBuilder.main] :: Initiated fixed thread pool of size 40
2023-05-15 11:02:16,678 [INFO] [o.j.g.d.StandardJanusGraph.main] :: Gremlin script evaluation is disabled
2023-05-15 11:02:16,704 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-15T16:02:16.704454Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@74ea46e2
2023-05-15 11:02:16,762 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-15 11:02:16,989 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:16,992 [INFO] [Main.main] :: drop g.V().hasLabel("entity").count().next(): 0
2023-05-15 11:02:17,010 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:17,014 [INFO] [Main.main] :: makePropertyKey g.V().hasLabel("entity").count().next(): 0
2023-05-15 11:02:17,748 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:17,759 [INFO] [Main.main] :: addVertex g.V().hasLabel("entity").count().next(): 1
2023-05-15 11:02:17,761 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:17,764 [INFO] [Main.main] :: updateIndex g.V().hasLabel("entity").count().next(): 1
Process finished with exit code 0
Code
public class Test {
private static final Logger logger = LogManager.getLogger(Main.class);
public static void main(String[] args) throws InterruptedException, ExecutionException {
JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
GraphTraversalSource g = janusGraph.traversal();
g.V().drop().iterate();
janusGraph.tx().commit();
logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
if (!janusGraphManagement.containsPropertyKey("_id"))
janusGraphManagement.makePropertyKey("_id").dataType(String.class).make();
janusGraphManagement.commit();
logger.info("makePropertyKey g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
JanusGraphVertex vertex = janusGraph.addVertex("entity");
vertex.property("_id", "Test1");
janusGraph.tx().commit();
janusGraph.tx().open();
logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
janusGraph.close();
}
}
Final issue was understanding that
propertyKeyand itsvaluehave to be supplied.Thanks again Stephen Mallette! He provided a great resource from Jason Plurad if you extrapolate more of what you read.
Switched back to
inmemorylike Kelvin Lawrence suggested earlier, since I'm not using large memory to get this concept down.