I have a Python library that depends on PyTesseractOCR and on Poppler. Both of these have to be installed and on PATH. But I don't like that the end user has to install several other programs and add them to PATH, as it's not very user-friendly.
How can I use these programs without requiring the user to install them and add them to PATH?
I have tried putting the folders in my own PyPI package's 'src' folder to ship them together with the rest of the code, but:
- PyTesseractOCR has a path-director like this
pytesseract.pytesseract.tesseract_cmd = r'src\Tesseract-OCR\tesseract.exe'
but it is way too big for PyPI's standards (450MB instead of the 100MB limit).
- Poppler doesn't seem to have a path-director function like Tesseract does.
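To make the bundling attempt above concrete, here is a minimal sketch of what I mean, not a tested solution. It resolves the shipped folders relative to the package file instead of the current working directory, and puts a folder at the front of PATH for the running process only, which is the usual way a tool without a path setting gets found. The helper names `bundled_dir` and `prepend_to_path` and the folder layout are hypothetical.

```python
import os
from pathlib import Path


def bundled_dir(package_file, *parts):
    """Return the absolute path of a folder shipped next to `package_file`.

    Resolving against the module file avoids depending on the current
    working directory, which a relative path like 'src\\...' would do.
    """
    return Path(package_file).resolve().parent.joinpath(*parts)


def prepend_to_path(directory, env=None):
    """Put `directory` first on PATH in `env` (default: this process's environment).

    This only affects the running process and the subprocesses it starts;
    the user's system-wide PATH is left untouched.
    """
    env = os.environ if env is None else env
    entries = [e for e in env.get("PATH", "").split(os.pathsep) if e]
    if str(directory) not in entries:
        entries.insert(0, str(directory))
    env["PATH"] = os.pathsep.join(entries)
    return env["PATH"]


# Hypothetical usage inside the package's __init__.py, assuming the layout
# <package>/Tesseract-OCR/tesseract.exe and <package>/poppler/bin:
#
#   import pytesseract
#   exe = bundled_dir(__file__, "Tesseract-OCR") / "tesseract.exe"
#   pytesseract.pytesseract.tesseract_cmd = str(exe)
#   prepend_to_path(bundled_dir(__file__, "poppler", "bin"))
```

This does not solve the size problem, since the binaries would still have to be shipped or downloaded somehow.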
I have also looked online for ways to add them to PYTHONPATH, but it is unclear to me whether this can work, or whether it's a recommended way of doing things.
Any other ideas are very welcome.
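As far as I understand, PYTHONPATH only extends the module search path used by `import`, while executables are looked up through PATH. A small sketch of how the library could check at runtime whether the external programs are discoverable, using the standard library's `shutil.which`; the helper name `missing_tools` is hypothetical, and `tesseract` and `pdftoppm` are just the executable names I would expect to check for.

```python
import shutil


def missing_tools(names, path=None):
    """Return the executables from `names` that cannot be found.

    `path` is a search string in PATH format; None means "use the
    current PATH", which is what shutil.which does by default.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # Report which of the external programs are not on PATH right now.
    print(missing_tools(["tesseract", "pdftoppm"]))
```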
ChatGPT has these two recommendations:
- Update setup.py: Open your setup.py file and add the following lines at the top, below the existing imports:
import subprocess
# Install Poppler and PyTesseract using pip
subprocess.run(["pip", "install", "poppler-utils>=0.70.0", "pytesseract>=0.3.8"])
- Update pyproject.toml: Open your pyproject.toml file and add the following lines under the [build-system] section:
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
[tool.poetry.dependencies]
python = "^3.6"
poppler-utils = "^0.70.0"
pytesseract = "^0.3.8"