I have the following triples in my Turtle file, and I would like to apply a regex that filters out the last triples (the ones whose subject contains /terms/).
ttl file:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

### https://rmswi/#/lists/100000000001
<https://rmswi/#/lists/100000000001> rdf:type owl:Class ;
    rdfs:label "Age Range" .

### https://rmswi/#/lists/100000000001/terms/100000000029
<https://rmswi/#/lists/100000000001/terms/100000000029> rdf:type owl:Class ;
    rdfs:subClassOf <https://rmswi/#/lists/100000000001> ;
    <http://purl.obolibrary.org/obo/IAO_0000115> "Any human before birth." ;
    <http://www.geneontology.org/formats/oboInOwl#hasExactSynonym> "Fetus" ,
        "Foetus" ,
        "In utero" ;
    rdfs:label "In utero" ;
    <https://ontology/properties/Domain> "https://rmswi/#/lists/100000000004/terms/100000000012" ;
    <https://ontology/properties/Term_Status> "CURRENT" .
I have the following SPARQL query to extract all the triples that have rdfs:label as the predicate:
query = """
prefix oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>
prefix obo: <http://purl.obolibrary.org/obo/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?s ?p ?o
WHERE {
    ?s rdfs:label ?o ;
    FILTER (strstarts(str(?s), 'https://rmswi/#/lists/')) .
    FILTER REGEX(?s, 'https:\/\/#\/lists\/\d+$' )
}
"""
qres = g.query(query)
for row in qres:
    print(row)
The expected output is:
(rdflib.term.URIRef('https://rmswi/#/lists/100000000001'), None, rdflib.term.Literal('Age Range'))
Any help is highly appreciated.