I have an RDF file with the following content:
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
<rdf:Description rdf:about="http://someurl.com/def/elementtype/projectState">
<rdfs:domain rdf:nodeID="projectState_0" />
</rdf:Description>
</rdf:RDF>
which is parsed by the following code:
import rdflib

g = rdflib.Graph()
# Graph.load() was removed in rdflib 6; Graph.parse() reads the file directly.
g.parse("problem/err.rdf", format="application/rdf+xml")

# Print every triple in the graph.
for s, p, o in g:
    print(f"subject:{s}")
    print(f"predicate:{p}")
    print(f"object:{o}")
    print()
I'd expect the predicate to expose the attribute nodeID, but I did not find a way to get it. The documentation also doesn't mention XML attributes on BNodes (blank nodes without content).
Blank node identifiers generally aren't guaranteed to be preserved when importing graphs (some graph databases, like GraphDB, do offer the option to preserve them). When I run the code the first time, the output is
When I run it a second time, the output is
So, regarding the question of exposing the nodeID: it is exposed, it's just that the parser does not respect the identifier that you gave it. See this issue for more information.
I would suggest one of the following:
i. Use a different graph database that supports blank node preservation
ii. Use an XML parser
iii. Elevate the blank node to an rdf:resource
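
As a rough sketch of option ii, the nodeID can be read straight from the XML with Python's standard library, bypassing the RDF parser entirely. The function name find_node_ids and the inlined RDF_XML constant are made up for illustration, and the sketch assumes the simple layout from the question (rdf:Description elements with rdf:about, and rdf:nodeID on their direct child property elements); other RDF/XML shapes would need more handling.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Same document as in the question, inlined so the sketch is self-contained.
RDF_XML = """<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>
  <rdf:Description rdf:about="http://someurl.com/def/elementtype/projectState">
    <rdfs:domain rdf:nodeID="projectState_0" />
  </rdf:Description>
</rdf:RDF>
"""


def find_node_ids(xml_bytes):
    """Return (subject, predicate, nodeID) tuples for each property with rdf:nodeID."""
    root = ET.fromstring(xml_bytes)
    results = []
    # Only top-level rdf:Description elements are considered (assumption).
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            node_id = prop.get(f"{{{RDF_NS}}}nodeID")
            if node_id is not None:
                # ElementTree tags use Clark notation: "{namespace}localname".
                namespace, _, local = prop.tag[1:].partition("}")
                results.append((subject, namespace + local, node_id))
    return results


for subject, predicate, node_id in find_node_ids(RDF_XML.encode("utf-8")):
    print(f"subject:{subject}")
    print(f"predicate:{predicate}")
    print(f"nodeID:{node_id}")
```

This only recovers the identifier as written in the file; it does not change what rdflib assigns to the blank node when it parses the same document.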