I have a .ttl file and I want to extract all distinct predicates from it. I am using Apache Jena. For this, I have used this SPARQL query:
"SELECT DISTINCT ?property WHERE {" +
" ?s ?property ?o ."
+ "}";
And I get a result, something like this:
<http://something.dk/ontology/business/name>
<http://something.dk/ontology/business/id>
What I want is to get rid of this prefix,
<http://something.dk/ontology/business/>
and get only name and id as predicates which will be used to get their object value accordingly. For now, I'm doing this:
"prefix j.0: <http://something.dk/ontology/business/> " +
"select ?a ?b where {" +
" ?Name j.0:name ?a ."
+ " ?Name j.0:id ?b ."
+ "}";
But this is not efficient as there might be some other properties. How can I get all predicates from the model without prefixes and use those predicates to get the object values?
Your predicate URIs all contain the word "ontology"... do you actually have an ontology? Do you understand that an ontology is different from just any free-form linked data triples? Where are the class <http://something.dk/ontology/business/village> and the predicate <http://something.dk/ontology/business/population> defined?
In other words, for these data triples:
I would expect to see at least the following minimal ontology:
If you load both the data and the ontology into a triplestore like Jena Fuseki, this query:
Returns this result:
If you're using one of Jena's other ways of accessing RDF content, you could use the same query, but you would have to use a different method for combining the data triples and the triples from the ontology.
@AKSW's comment is one way of doing a sub-string removal for this particular task. Specifically, we are removing the content of the default : prefix from every URI. A more general function is replace().
I have never seen @AKSW give bad advice, but I would really urge you to get into the habit of using a proper ontology, not a string manipulation workaround.
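If you do end up stripping the namespace on the Java side instead of inside the query, the sub-string removal can be sketched roughly as follows. This is only an illustration using plain JDK string handling, not a Jena API: the class name PrefixStripper and the method stripNamespace are hypothetical, and the namespace is the one from the question.

```java
import java.util.List;

final class PrefixStripper {
    // Namespace copied from the question; adjust it to your own data.
    static final String NS = "http://something.dk/ontology/business/";

    // Hypothetical helper: drops the namespace when the URI starts with it,
    // and returns the URI unchanged otherwise.
    static String stripNamespace(String uri, String ns) {
        return uri.startsWith(ns) ? uri.substring(ns.length()) : uri;
    }

    public static void main(String[] args) {
        // Predicate URIs as they would come back from the DISTINCT query.
        for (String uri : List.of(NS + "name", NS + "id")) {
            System.out.println(stripNamespace(uri, NS));
        }
    }
}
```

URIs from other namespaces pass through untouched, so predicates from a different vocabulary are not silently mangled.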
@Stanislav also knows his stuff. It looks to me like afn:localname() is a convenience function, so you don't have to type out this regular expression replacement: REPLACE(STR(?x), "^(.*)(/|#)([^#/]*)$", "$3")
A fun exercise would be obtaining or synthesizing many thousands of triples like you provided and timing the performance of these three different labeling methods.
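To see what that regular expression does to a URI, here is a minimal sketch that applies the same pattern with the JDK's own regex engine instead of SPARQL. The class name LocalNames and the method localName are hypothetical, and I am assuming the pattern behaves the same in Java as in the SPARQL REPLACE call for these simple inputs; it is not meant to reproduce afn:localname() exactly.

```java
final class LocalNames {
    // Same pattern as in the REPLACE call: everything up to the last '/' or '#',
    // then the separator itself, then the trailing part that contains neither.
    static final String PATTERN = "^(.*)(/|#)([^#/]*)$";

    // Hypothetical helper: keeps only capture group 3, the trailing part.
    // A string with no '/' or '#' does not match and is returned unchanged.
    static String localName(String uri) {
        return uri.replaceAll(PATTERN, "$3");
    }

    public static void main(String[] args) {
        System.out.println(localName("http://something.dk/ontology/business/name"));
        System.out.println(localName("http://www.w3.org/2000/01/rdf-schema#label"));
    }
}
```

Unlike removing one known prefix, this works for any namespace, at the cost of losing the information about which vocabulary a predicate came from.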
Also, with an ontology you could set the domain and ranges for your datatype properties, like population. That should take an xsd:integer, not an untyped string, in my opinion.