I've installed Tesseract on macOS (Ventura). When I run it in the terminal it works fine. When I run the HTML/PHP code through the Apache2 server located under the
'/usr/local/' path,
it works. I'm not sure about the one under the
'/private/' path, if that makes sense.
My issue comes when I try to run Tesseract under XAMPP!
On the Mac I first get this error:
Error! The command "tesseract" was not found. Make sure you have Tesseract OCR installed on your system: https://github.com/tesseract-ocr/tesseract The current $PATH is /usr/bin:/bin:/usr/sbin:/sbin
If I change the executable path, I get the following error:
[01-Mar-2024 14:33:41 Europe/Berlin] PHP Fatal error: Uncaught thiagoalessio\TesseractOCR\UnsuccessfulCommandException: Error! The command did not produce any output.
Generated command:
"/usr/local/Cellar/tesseract/5.3.4_1/bin/tesseract" "uploads/test.png" "/var/folders/ct/4574d7l95d71vhc0bf_mj7rr0000gn/T/" -c "tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ" -l eng
Returned message:
dyld[2171]: Symbol not found: _WebPMalloc
Referenced from: <405E0AB2-EB03-3BCF-BA09-6CAFB225680E> /usr/local/Cellar/webp/1.3.2/lib/libwebpmux.3.0.13.dylib
Expected in: <4D95CAC6-C8B4-39C0-9B98-B286ABEB684E> /Applications/XAMPP/xamppfiles/lib/libwebp.7.dylib in /Applications/XAMPP/xamppfiles/htdocs/texte/vendor/thiagoalessio/tesseract_ocr/src/FriendlyErrors.php:66
My guess is that it has something to do with either (1) how PHP under XAMPP executes commands, because I've also tried executing tesseract commands directly from PHP and nothing works, or (2) the environment path. I'm not sure how that works, so I can't explain it properly.
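A minimal sketch for checking guess (2): it prints the environment variables that decide where a command is looked up and which shared libraries it loads, as seen by the PHP process itself. The function name `describeExecEnvironment` is made up for illustration, and the choice of variables is an assumption based on the dyld message above, which shows a Homebrew library being resolved against a file in XAMPP's lib directory.

```php
<?php
// Diagnostic sketch (hypothetical helper, not part of any library):
// collect the variables that influence command lookup and dynamic linking.
function describeExecEnvironment(): array
{
    $report = [];
    foreach (['PATH', 'DYLD_LIBRARY_PATH', 'DYLD_FALLBACK_LIBRARY_PATH'] as $name) {
        $value = getenv($name);
        // getenv() returns false when the variable is not set.
        $report[$name] = ($value === false) ? null : $value;
    }
    return $report;
}

// Print one NAME=value line per variable.
foreach (describeExecEnvironment() as $name => $value) {
    echo $name, '=', $value ?? '(not set)', PHP_EOL;
}
```

Running this once from the terminal and once through XAMPP should show whether the two environments differ, for example whether XAMPP's PHP sees a shorter PATH or a DYLD_LIBRARY_PATH that points into /Applications/XAMPP/xamppfiles/lib.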
The code that I'm using is:
<?php
$fileRead = '';
use thiagoalessio\TesseractOCR\TesseractOCR;
require 'vendor/autoload.php';
if ($_SERVER['REQUEST_METHOD'] == 'POST') {
    if (isset($_POST['submit'])) {
        $file_name = $_FILES['file']['name'];
        $tmp_file = $_FILES['file']['tmp_name'];
        if (!session_id()) {
            session_start();
            $unq = session_id();
        }
        // Build a unique, filesystem-safe name. '/' and '\\' are separate entries
        // (the original concatenated them into the single string '/\').
        $file_name = uniqid() . '_' . time() . '_' . str_replace(array('!', "@", '#', '$', '%', '^', '&', ' ', '*', '(', ')', ':', ';', ',', '?', '/', '\\', '~', '`', '-'), '_', strtolower($file_name));
        if (move_uploaded_file($tmp_file, 'uploads/' . $file_name)) {
            try {
                // Run OCR on the uploaded image.
                $fileRead = (new TesseractOCR('uploads/' . $file_name))
                    ->setLanguage('eng')
                    ->run();
            } catch (Exception $e) {
                echo $e->getMessage();
            }
        } else {
            echo "<p class='alert alert-danger'>File failed to upload.</p>";
        }
    }
}
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Document Reader</title>
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
</head>
<body>
<div class="container mt-5">
<div class="row mt-5">
<div class="col-sm-8 mx-auto">
<div class="jumbotron">
<h1 class="display-4">Read Text from Images</h1>
<p class="lead">
<?php if ($_POST) : ?>
<pre>
<?= htmlspecialchars($fileRead) ?>
</pre>
<?php endif; ?>
</p>
<hr class="my-4">
</div>
</div>
</div>
<div class="row col-sm-8 mx-auto">
<div class="card mt-5">
<div class="card-body">
<form action="" method="post" enctype="multipart/form-data">
<div class="form-group">
<label for="filechoose">Choose File</label>
<input type="file" name="file" class="form-control-file" id="filechoose">
<button class="btn btn-success mt-3" type="submit" name="submit">Upload</button>
</div>
</form>
</div>
</div>
</div>
</div>
<script src="https://code.jquery.com/jquery-3.6.1.min.js"></script>
</body>
</html>
I am not familiar with PHP, but the message
Error! The command "tesseract" was not found
means that tesseract is not in the PATH that is available to your script. Check the configuration of the PHP library that provides the Tesseract functionality to see whether you can specify the absolute path to Tesseract. Alternatively, adjust the relevant environment or PHP variables so that your script can use your Tesseract installation.
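As a rough sketch of the second option, the snippet below adjusts the environment from inside the script before any external command is started. Several things here are assumptions rather than confirmed facts: that Homebrew links tesseract into /usr/local/bin (suggested by the /usr/local/Cellar path in the error), that XAMPP exports DYLD_LIBRARY_PATH (suggested by the dyld message resolving against /Applications/XAMPP/xamppfiles/lib), and that the OCR wrapper inherits the environment of the PHP process. The helper name `prependToPath` is hypothetical.

```php
<?php
// Sketch under stated assumptions, not a verified fix.

// Return $currentPath with $dir added at the front, unless it is already listed.
function prependToPath(string $dir, string $currentPath): string
{
    $parts = $currentPath === '' ? [] : explode(PATH_SEPARATOR, $currentPath);
    if (in_array($dir, $parts, true)) {
        return $currentPath;
    }
    array_unshift($parts, $dir);
    return implode(PATH_SEPARATOR, $parts);
}

// Assumption: Homebrew's bin directory on an Intel Mac.
$brewBin = '/usr/local/bin';

// Make the Homebrew directory visible to commands started from this script.
putenv('PATH=' . prependToPath($brewBin, (string) getenv('PATH')));

// Calling putenv() with a bare name removes the variable, so child processes
// are no longer pointed at XAMPP's bundled libraries (if it was set at all).
putenv('DYLD_LIBRARY_PATH');
```

These lines would need to run before the `(new TesseractOCR(...))->run()` call. If the wrapper version in use offers a way to pass an absolute executable path, that covers the "not found" error on its own, but the library mismatch would probably still need the DYLD_LIBRARY_PATH change or an equivalent edit to XAMPP's own environment setup.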