How to identify a header and its sub-headers in a PDF using the Azure Document Intelligence Layout model?


I am using the Azure Document Intelligence Layout model to extract information from a PDF. I call the API through the Python SDK and pull the relevant information out of the JSON output. I write the extracted information to a CSV file in which the first column holds the header and the second column holds the section text that belongs to that header.

I can identify the headers from the role key of each entry in the paragraphs collection of the JSON output, where the model marks them as sectionHeading.
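The flow described above can be sketched roughly as follows. This is a simplified illustration, not the exact code in use: it assumes the Layout result has already been loaded as a plain Python dict, and the function names (`group_by_section_heading`, `rows_to_csv`) and the sample data are hypothetical. Only the `paragraphs`, `role`, `content` and `sectionHeading` names come from the JSON output.

```python
import csv
import io


def group_by_section_heading(paragraphs):
    """Pair each sectionHeading paragraph with the plain paragraphs after it.

    Paragraphs that appear before the first heading, or that carry any
    other role (title, page header, ...), are skipped in this sketch.
    """
    rows = []
    heading, body = None, []
    for para in paragraphs:
        role = para.get("role")
        if role == "sectionHeading":
            # A new heading closes the previous section, if there was one.
            if heading is not None:
                rows.append((heading, " ".join(body)))
            heading, body = para["content"], []
        elif role is None and heading is not None:
            # Paragraphs without a role are treated as section body text.
            body.append(para["content"])
    if heading is not None:
        rows.append((heading, " ".join(body)))
    return rows


def rows_to_csv(rows):
    """Serialize (header, section) pairs into CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["header", "section"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical excerpt of the "paragraphs" collection, for illustration only.
sample_paragraphs = [
    {"role": "title", "content": "Annual Report"},
    {"role": "sectionHeading", "content": "1. Introduction"},
    {"content": "This report covers 2023."},
    {"role": "sectionHeading", "content": "1.1 Scope"},
    {"content": "Only EU operations."},
    {"content": "Figures are in EUR."},
]

rows = group_by_section_heading(sample_paragraphs)
csv_text = rows_to_csv(rows)
```

In this sketch "1. Introduction" and "1.1 Scope" end up as two separate rows, because both paragraphs carry the same sectionHeading role and the loop has nothing else to go on.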

However, I am not sure how to differentiate between headers and their subheaders so that I can combine each header with its subheaders.

Thanks in advance.
