'BuiltIn' is not a supported encoding name


I am converting a PDF to text using 'iText.PdfTextExtractor', and I get the following error on only some of the pages I try to convert:

'BuiltIn' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method. (Parameter 'name')

I've tried adding the following code before opening the file stream, but I am still receiving the error:

System.Text.EncodingProvider provider = System.Text.CodePagesEncodingProvider.Instance;
Encoding.RegisterProvider(provider);
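
To check whether the registration can help at all, here is a minimal sketch that runs outside of iText. It assumes .NET Core 3.0 or later (where CodePagesEncodingProvider ships with the runtime), and the class name EncodingNameCheck is made up for illustration. As far as I can tell, the provider only adds legacy code page names such as "windows-1252", so a lookup for "BuiltIn" still throws after registering it:

```csharp
using System;
using System.Text;

public static class EncodingNameCheck
{
    public static void Main()
    {
        // Same registration as in the question.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // A legacy code page name resolves once the provider is registered.
        Encoding cp1252 = Encoding.GetEncoding("windows-1252");
        Console.WriteLine(cp1252.CodePage);

        try
        {
            // "BuiltIn" is not a code page name, so the provider cannot resolve it.
            Encoding.GetEncoding("BuiltIn");
        }
        catch (ArgumentException ex)
        {
            // Prints the exception text for comparison with the error quoted above.
            Console.WriteLine(ex.Message);
        }
    }
}
```

If the printed message matches the error quoted above, the failing lookup is for a name that no registered provider knows, which would explain why adding the provider does not change anything.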

Here is my code:

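using System.Text;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;
using iText.Kernel.Pdf.Canvas.Parser.Listener;

public void ExtractFromPdf(string pdfFile, ClaimInfo claimInfo, string memberId)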
{
    System.Text.EncodingProvider provider = System.Text.CodePagesEncodingProvider.Instance;
    Encoding.RegisterProvider(provider);

    PdfReader pdfRead = new PdfReader(pdfFile);
    PdfDocument pdfDoc = new PdfDocument(pdfRead);

    // iText page numbers are 1-based, so use <= to include the last page.
    for (int page = 1; page <= pdfDoc.GetNumberOfPages(); page++)
    {
        string convertToText = PdfToText(pdfDoc, page);
    }
}

private string PdfToText(PdfDocument pdfDoc, int pageNo)
{
    ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
    return PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(pageNo), strategy);
}

The error occurs in PdfToText, at this line: return PdfTextExtractor.GetTextFromPage(pdfDoc.GetPage(pageNo), strategy);

I've searched everywhere, and it seems that 'BuiltIn' is a built-in encoding name, but I can't find where it is defined or how to make it resolve. Any ideas?
