I have an RTF document and need to output all of the text from it. The text is divided into blocks, though, and the blocks contain no actual characters. When I try standard Python libraries, the characters come out in the wrong encoding, and when I set the encoding to UTF-8, the program crashes.

Example of a block:
def extract_text_from_rtf_file(rtf_file_path):
    with open(rtf_file_path, 'r', encoding='latin-1') as file:
        rtf_content = file.read()
    plain_text = rtf_content.replace('\\', '').replace('{', '').replace('}', '')
    return plain_text

rtf_file_path = 'D:/xxxxxxxxxxxxxxxxxxxxxx.rtf'
rtf_text = extract_text_from_rtf_file(rtf_file_path)
print(rtf_text)
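
For context on why the code above prints garbage: RTF usually stores non-ASCII text as escape sequences rather than as literal characters, either \'hh (a byte in the document's code page) or \uN (a decimal UTF-16 code unit followed by a fallback character). Removing the backslashes leaves the hex digits or numbers behind instead of the letters. Below is a minimal sketch of decoding those two escape forms with the standard library only. The function name decode_rtf_escapes and the default code page cp1252 are assumptions for illustration; the real code page is normally declared in the file header as \ansicpgN. The sketch also assumes the default of one fallback character after each \uN, and it does not handle characters outside the Basic Multilingual Plane, font tables, or other header groups.

```python
import re


def decode_rtf_escapes(rtf_content, codepage="cp1252"):
    """Turn \\'hh and \\uN escapes in RTF source into real characters.

    `codepage` is an assumption: pass the one your file declares
    (for example "cp1251" when the header contains \\ansicpg1251).
    """

    def replace_unicode(match):
        # \uN holds a signed 16-bit decimal value; negative values wrap around.
        code = int(match.group(1))
        if code < 0:
            code += 65536
        return chr(code)

    # Decode \uN and drop the single fallback that follows it,
    # which is either a literal '?' or one \'hh escape.
    text = re.sub(
        r"\\u(-?\d+) ?(?:\\'[0-9a-fA-F]{2}|\?)?",
        replace_unicode,
        rtf_content,
    )

    def replace_hex_run(match):
        # Collect a run of \'hh escapes into bytes and decode them together,
        # so multi-byte code pages are handled as well.
        raw = bytes.fromhex(match.group(0).replace("\\'", ""))
        return raw.decode(codepage, errors="replace")

    return re.sub(r"(?:\\'[0-9a-fA-F]{2})+", replace_hex_run, text)
```

One way to use it would be to call decode_rtf_escapes(rtf_content, codepage=...) on the file contents before stripping any remaining control words and braces. Reading the file with encoding='latin-1' is fine for this purpose, since it maps every byte to a character and never raises, whereas UTF-8 fails on any byte sequence that is not valid UTF-8.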