TypeError: decoding to str: need a bytes-like object, list found


I'm working with the invoice2data library and getting an error. My template is ready, but I get an error when I pass an invoice to it. Please help me out.

Here is my code:

import re
from invoice2data import extract_data
from invoice2data.extract.loader import read_templates
templates = read_templates('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Templates')
print(templates)
templates_str = re.escape(templates)
result = extract_data('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Invoice\\Invoice.pdf',templates)
print("\n")
print(result)

Here is my error:

Traceback (most recent call last)
Input In [9], in <cell line: 6>()
      4 templates = read_templates('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Templates')
      5 print(templates)

----> 6 templates_str = re.escape(templates)
      7 result = extract_data('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Invoice\\Invoice.pdf',templates)
      8 print("\n")

File D:\Users\shash\anaconda3\lib\re.py:277, in escape(pattern)
    275     return pattern.translate(_special_chars_map)
    276 else:
--> 277     pattern = str(pattern, 'latin1')
    278     return pattern.translate(_special_chars_map).encode('latin1')

TypeError: decoding to str: need a bytes-like object, list found

Here are my templates:

[InvoiceTemplate([('issuer', 'SAVEX TECHNOLOGIES PRIVATE LIMITED'), ('keywords', ['SAVEX TECHNOLOGIES PRIVATE LIMITED', 'WB-10/11, Renaissance logistics park,, Near vil']), ('fields', OrderedDict([('amount', 'TOTAL:\\s+(\\d+,\\d+\\.\\d\\d)'), ('date', 'Invoice Date:\\s+(\\d{1,2}\\.\\d{1,2}\\.\\d{4})'), ('Order_date', 'Order Date:\\s+(\\d{1,2}\\.\\d{1,2}\\.\\d{4})'), ('invoice_number', 'Invoice Number:\\s+(\\w{3}\\d{1,8})')])), ('options', OrderedDict([('remove_whitespace', False), ('currency', 'INR'), ('date_formats', ['%d/%m/%Y']), ('languages', ['en'])])), ('template_name', 'Invoice.yml'), ('exclude_keywords', [])])]



Shahid Khan On

I tried to comment first, but I don't have 50 reputation, so I can't comment.

I faced a similar issue in one of my projects; adding an encoding argument resolved it for me.

You can try the same thing. I hope it works for you.

templates = read_templates('C:\\Users\\shash\\OneDrive\\Documents\\Desktop\\I\\Templates', encodings = 'utf-8')

Please try both encoding and encodings as the parameter name, since I'm not sure which one your version accepts.
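
Separately from encoding, the traceback in the question points at line 6, re.escape(templates), not at read_templates. re.escape accepts a str or a bytes-like object, and the printed templates value is a list, which matches the "list found" wording of the message. The sketch below reproduces that behaviour with the standard library only. The templates value is a simplified, hypothetical stand-in (a list holding a plain dict), not a real InvoiceTemplate, and try_escape is a helper made up for this illustration.

```python
import re

# Hypothetical stand-in for the list that read_templates() returned in the
# question; the real elements are InvoiceTemplate objects, not plain dicts.
templates = [{"issuer": "SAVEX TECHNOLOGIES PRIVATE LIMITED"}]


def try_escape(value):
    """Return re.escape(value), or the TypeError text if escaping fails."""
    try:
        return re.escape(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Passing the whole list fails, because re.escape only handles str or bytes.
list_result = try_escape(templates)

# Passing a single string works and backslash-escapes regex metacharacters.
str_result = try_escape("TOTAL: 1,234.50")

print(list_result)
print(str_result)
```

In the question's code, templates_str is never used afterwards (extract_data receives templates, not templates_str), so removing the re.escape line looks like the simplest way to get past this particular error. Whether extract_data then matches the invoice is a separate matter that depends on the template itself.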