How to get the level of a docx heading in Python?


I need to get the level of each heading in a .docx file and print it together with its hierarchical number, e.g. Heading 2 France, Heading 2.1 Paris, Heading 2.1.1 Visit, Heading 6 Portugal, etc.

How could I get the levels?

import docx
import os

# Build the path to the document in the current working directory.
current_directory = os.getcwd()
file_name = 'geo.docx'
docx_path = os.path.join(current_directory, file_name)

def extract_headings(docx_path):
    style_names = []
    heading_names = []
    doc = docx.Document(docx_path)
    for paragraph in doc.paragraphs:
        if paragraph.style.name.startswith('Heading'):
            style_names.append(paragraph.style.name)
            heading_names.append(paragraph.text)
    return style_names, heading_names

style_names, heading_names = extract_headings(docx_path)

for style, heading in zip(style_names, heading_names):
    print(style, heading)

Expected output: Heading 2 France, Heading 2.1 Paris, Heading 2.1.1 Visit, Heading 6 Portugal ...
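One possible way to get from the style names to numbers like 2.1.1 is to read the level out of the style name and keep a running counter per level. This is only a sketch: it assumes the document uses the built-in English style names "Heading 1", "Heading 2", and so on, and that the numbers should be counted from the order of the headings rather than read from Word's own list numbering. The helper names `heading_level` and `number_headings` are made up for illustration.

```python
import re

# Matches built-in heading style names such as "Heading 1" or "Heading 3".
HEADING_RE = re.compile(r"^Heading (\d+)$")


def heading_level(style_name):
    """Return the integer level of a 'Heading N' style name, or None."""
    match = HEADING_RE.match(style_name)
    return int(match.group(1)) if match else None


def number_headings(headings):
    """Turn (style_name, text) pairs into (number, text) pairs, e.g. ('2.1', 'Paris')."""
    counters = []  # counters[i] is the running count at level i + 1
    numbered = []
    for style_name, text in headings:
        level = heading_level(style_name)
        if level is None:
            continue  # not a heading style (e.g. "Title" or "Normal")
        # Drop deeper counters so sub-numbering restarts under the new heading.
        del counters[level:]
        # Pad with zeros if a level was skipped (e.g. Heading 1 then Heading 3).
        counters.extend([0] * (level - len(counters)))
        counters[level - 1] += 1
        numbered.append((".".join(str(c) for c in counters), text))
    return numbered
```

With the code from the question, the input pairs would be `zip(style_names, heading_names)`, and each result could be printed as `print("Heading", number, text)`.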
