I'm working with the invoice2data library and getting an error. My template is ready, but I get an error when I pass an invoice to it. Please help me out.
Here is my code:
import re
from invoice2data import extract_data
from invoice2data.extract.loader import read_templates
templates = read_templates('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Templates')
print(templates)
templates_str = re.escape(templates)
result = extract_data('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Invoice\\Invoice.pdf',templates)
print("\n")
print(result)
Here is my error:
Traceback (most recent call last)
Input In [9], in <cell line: 6>()
4 templates = read_templates('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Templates')
5 print(templates)
----> 6 templates_str = re.escape(templates)
7 result = extract_data('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Invoice\\Invoice.pdf',templates)
8 print("\n")
File D:\Users\shash\anaconda3\lib\re.py:277, in escape(pattern)
275 return pattern.translate(_special_chars_map)
276 else:
--> 277 pattern = str(pattern, 'latin1')
278 return pattern.translate(_special_chars_map).encode('latin1')
TypeError: decoding to str: need a bytes-like object, list found
Here is the output of print(templates):
[InvoiceTemplate([('issuer', 'SAVEX TECHNOLOGIES PRIVATE LIMITED'), ('keywords', ['SAVEX TECHNOLOGIES PRIVATE LIMITED', 'WB-10/11, Renaissance logistics park,, Near vil']), ('fields', OrderedDict([('amount', 'TOTAL:\\s+(\\d+,\\d+\\.\\d\\d)'), ('date', 'Invoice Date:\\s+(\\d{1,2}\\.\\d{1,2}\\.\\d{4})'), ('Order_date', 'Order Date:\\s+(\\d{1,2}\\.\\d{1,2}\\.\\d{4})'), ('invoice_number', 'Invoice Number:\\s+(\\w{3}\\d{1,8})')])), ('options', OrderedDict([('remove_whitespace', False), ('currency', 'INR'), ('date_formats', ['%d/%m/%Y']), ('languages', ['en'])])), ('template_name', 'Invoice.yml'), ('exclude_keywords', [])])]
I wanted to post this as a comment, but I don't have the 50 reputation that requires.
I faced a similar issue in one of my projects, and adding an encoding attribute resolved it.
You can check this as well; I hope it works for you.
Please check both encoding and encodings.
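Judging from the traceback alone, the TypeError seems to be raised by re.escape(templates) rather than by invoice2data: re.escape accepts a str or bytes pattern, while read_templates returned a list here. Since templates_str is never used afterwards, dropping that line is probably enough. Below is a minimal sketch using only the standard library. The dict is a stand-in for the real InvoiceTemplate objects, and try_escape is a hypothetical helper used only for illustration.

```python
import re

# Stand-in for the list returned by read_templates(); the real items are
# InvoiceTemplate objects, a plain dict is used here only for illustration.
templates = [{"issuer": "SAVEX TECHNOLOGIES PRIVATE LIMITED"}]


def try_escape(value):
    """Return the escaped pattern, or None if re.escape rejects the type."""
    try:
        return re.escape(value)
    except TypeError:
        return None


escaped_list = try_escape(templates)          # a list is rejected
escaped_text = try_escape("TOTAL: 1,000.00")  # a str is accepted

print(escaped_list)
print(escaped_text)
```

If that is what is happening, the original script should work once the re.escape line is removed and the templates list is passed to extract_data unchanged, as the last lines of the question's code already do.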