I am using Beautiful Soup to traverse some TEI XML that I've written for Peanuts comic strips. I'm trying to isolate certain features that are recorded in the div using the @ana attribute.
<text>
  <body>
    <head><emph>Peanuts</emph>, <date when="1971-10-01">1 October 1971</date></head>
    <div type="panelGrp" xml:id="Peanuts1971-10-01" ana="#s-psych #s-outside">
      ...
    </div>
  </body>
</text>
I can isolate this particular div (the only one in each document) using the following.
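from bs4 import BeautifulSoup

def make_soup(xmlfile):
    # Parse the TEI file with the lxml XML parser
    with open(xmlfile) as xml_file:
        soup = BeautifulSoup(xml_file, 'lxml-xml')
    return soup

soup = make_soup(xmlfile)  # xmlfile: path to one TEI document
div = soup.find('div')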
Where I am stuck, however, is accessing the contents of @ana. In this case, the output should be #s-psych #s-outside.
I don't have your function, but I think you can pick the answer from my mockup:
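A minimal sketch of such a mockup, assuming the XML from the question is held in a string rather than read from a file (the names `tei_xml`, `ana` and `ana_safe` are hypothetical and used only for illustration). It relies on the fact that a Beautiful Soup tag can be indexed like a dictionary of its attributes:

```python
from bs4 import BeautifulSoup

# Mockup: the question's XML held in a string instead of a file
# (tei_xml is a hypothetical name used only for this sketch).
tei_xml = """<text>
<body>
<head><emph>Peanuts</emph>, <date when="1971-10-01">1 October 1971</date></head>
<div type="panelGrp" xml:id="Peanuts1971-10-01" ana="#s-psych #s-outside">
...
</div>
</body>
</text>"""

soup = BeautifulSoup(tei_xml, 'lxml-xml')
div = soup.find('div')

# A Tag can be indexed like a dict of its attributes.
ana = div['ana']            # raises KeyError if the attribute is absent
ana_safe = div.get('ana')   # returns None instead of raising

print(ana)
```

With the XML parser the attribute value comes back as one string, so `ana.split()` would give the individual references as a list if you need them separately.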
Output: #s-psych #s-outside