Run tesseract.js OCR onFileUpload and extract text


I think the title is self-explanatory: I want to extract the text from a photo right after the user selects it. The error I get is:

createWorker.js:173 Uncaught Error: RuntimeError: null function or function signature mismatch

What am I doing wrong?

const { createWorker } = require("tesseract.js");

const [file,setFile] = useState();
  const worker = createWorker({
    logger: (m) => console.log(m),
  });

  const doOCR = async (image) => {
    await worker.load();
    await worker.loadLanguage("eng");
    await worker.initialize("eng");
    const {
      data: { text },
    } = await worker.recognize(image);
    // } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
    console.log(text);
    setOcr(text);
  };

  const [ocr, setOcr] = useState("Recognizing...");
  useEffect(() => {
    file ? doOCR(file) : console.log('no file selected yet!');
  }, [file]);

  const getFile = (e) => {
    console.log("Upload event:", e);
    if (e) {
      if (Array.isArray(e)) setFile(e[0]);
      setFile(e)
    }
  }

....

<p>{ocr}</p> /* this only displays "Recognizing..." */
<Form.Item
  name="uploadedPhoto"
  label="Upload your photo scan"
  getValueFromEvent={getFile}
  // rules={[{ required: true }]}>
  <Input type="file" 
  // onChange={onImageUpload}
/>
</Form.Item>

There is 1 best solution below


I solved it by attaching the handler to the onChange of the Input itself instead of to the Form.Item element:

  const handleFileSelected = (e) => {
    // e.target.files is a FileList; convert it and keep the first selected file
    const files = Array.from(e.target.files);
    setFile(files[0]);
  };

<Input type="file" onChange={handleFileSelected} />
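
A likely reason the original version failed: in getFile, setFile(e) runs unconditionally, so the state ends up holding the change event rather than a File, and that event is what gets passed to worker.recognize. The sketch below isolates the extraction step in a small helper. The name firstFileFromEvent is hypothetical (it is not part of tesseract.js or Ant Design), and it assumes a standard file-input change event where e.target.files is a FileList.

```javascript
// Hypothetical helper: return the first selected file from a file-input
// change event, or null when nothing usable was selected.
function firstFileFromEvent(e) {
  const list = e && e.target && e.target.files;
  if (!list || list.length === 0) {
    return null;
  }
  // FileList is array-like, so Array.from turns it into a real array.
  return Array.from(list)[0];
}

// Assumed usage inside the component from the question:
//   <Input type="file" onChange={(e) => setFile(firstFileFromEvent(e))} />
// The existing useEffect then only calls doOCR when file is not null.
```

Returning null for an empty selection keeps the existing `file ? doOCR(file) : ...` check meaningful, because the state never holds anything other than a file or an empty value.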