I'm trying to query DBpedia on a local installation of Virtuoso (a little over a billion triples), and I would like to read the entire dataset in pages of about 1,000 triples at a time. The following query seemed promising:
SELECT *
WHERE {
?s ?o ?p.
}
LIMIT 1000
OFFSET 10000000
until I realized that queries of this type run in time proportional to the OFFSET value.
Looking at the query plan, it seems that queries such as this get translated into SQL that looks like this:
SELECT TOP 100000000, 1 __id2in ( "s_7_2_t0"."S") AS "s",
__id2in ( "s_7_2_t0"."P") AS "o",
__ro2sq ( "s_7_2_t0"."O") AS "p"
FROM DB.DBA.RDF_QUAD AS "s_7_2_t0"
OPTION (QUIETCAST)
which confirms my observation.
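To put a rough number on why this matters for a full scan, here is a small sketch of the paging loop I have in mind. It assumes a simplified cost model in which each page visits its OFFSET rows plus the rows it returns; that model is my assumption based on the observation above, not a measured Virtuoso figure, and the helper names are made up for illustration.

```python
PAGE_SIZE = 1000  # triples per page, as in the query above


def sparql_page(offset, limit=PAGE_SIZE):
    """Build the text of the paged SPARQL query for one page."""
    return f"SELECT * WHERE {{ ?s ?o ?p . }} LIMIT {limit} OFFSET {offset}"


def rows_touched(total_triples, page_size=PAGE_SIZE):
    """Total rows visited when reading everything page by page.

    Assumed cost model (hypothetical): a page at a given OFFSET
    skips OFFSET rows and then returns up to page_size rows.
    """
    pages = -(-total_triples // page_size)  # integer ceiling division
    # Offsets are 0, page_size, 2*page_size, ...; sum them in closed form.
    skipped = page_size * pages * (pages - 1) // 2
    return skipped + total_triples


if __name__ == "__main__":
    print(sparql_page(10_000_000))
    print(rows_touched(1_000_000_000))
```

Under that assumption, reading a billion triples in pages of 1,000 visits on the order of 5 × 10^14 rows in total, compared with 10^9 for a single sequential pass, which is why I'm looking for something that doesn't pay for the OFFSET on every page.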
Is it possible to run such queries in constant time, either in SPARQL or directly in SQL against the underlying table? Since it's all SQL under the hood, I had hoped it would be straightforward to write the corresponding SQL query myself, but for some reason the query select * from DB.DBA.RDF_QUAD limit 1; fails with a syntax error, which leaves me more confused than ever.