My project was up and running for a while in a Kubernetes container... until I decided to "clean up" my use of the sys.path calls that I had at the top of my modules. This included describing my dependencies in pyproject.toml and ditching setup.py altogether; that file imported setuptools and called setup() when run as __main__.
The design intent is not to run anything in /app/tnc as a script, but rather to treat it as a collection of modules, i.e. a package. The only part of the codebase that serves as __main__ is the api.py file, which initializes and fires up Flask.
Implementation
I have a lean deployment setup that consists of the following:
- the core library in /opt/venv
- my package in /app/tnc
- the entry point at /app/bin/api
I kick off the Flask app with: python /app/bin/api.
The build takes place in the python:3.11-slim Docker image. Here I install the recommended gcc and specify the following in the Dockerfile:
-- build
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY pyproject.toml pyproject.toml
RUN pip3 install -e . -- << aside: better would be to use python -m pip install -e .
I then copy the following from the build into my runtime image.
-- runtime
ENV PATH="/opt/venv/bin:$PATH"
ENV PYTHONPATH="/opt/venv/bin:/app/tnc"
COPY --chown=appuser:appuser bin bin
COPY --chown=appuser:appuser tnc tnc
COPY --chown=appuser:appuser config.py config.py
COPY --from=builder /opt/venv/ /opt/venv
As I mentioned, in the Kubernetes deployment I fire up the container with:
command: ["python3"]
args: ["bin/api"]
My observations working to find the solution
Firing up the container in such a way that I can run the Python REPL:
- import flask generates AttributeError ...replace(' -> None', '')
- remove /app/tnc from the PYTHONPATH, and import flask generates ModuleNotFound ... no tnc
AttributeError ...replace(' -> None', '')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/venv/lib/python3.10/site-packages/werkzeug/__init__.py", line 2, in <module>
from .test import Client as Client
File "/opt/venv/lib/python3.10/site-packages/werkzeug/test.py", line 35, in <module>
from .sansio.multipart import Data
File "/opt/venv/lib/python3.10/site-packages/werkzeug/sansio/multipart.py", line 19, in <module>
class Preamble(Event):
File "/usr/local/lib/python3.10/dataclasses.py", line 1175, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash,
File "/usr/local/lib/python3.10/dataclasses.py", line 1093, in _process_class
str(inspect.signature(cls)).replace(' -> None', ''))
AttributeError: module 'inspect' has no attribute 'signature'
ModuleNotFoundError: No module named 'tnc'
appuser@tnc-py-deployment-set-1:/app$ echo $PYTHONPATH
/opt/venv/bin
appuser@tnc-py-deployment-set-1:/app$ echo $PATH
/opt/venv/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
appuser@tnc-py-deployment-set-1:/app$ python -m /app/bin/api
/opt/venv/bin/python: No module named /app/bin/api
appuser@tnc-py-deployment-set-1:/app$ python /app/bin/api
Traceback (most recent call last):
File "/app/bin/api", line 12, in <module>
from tnc.s3 import S3Session
ModuleNotFoundError: No module named 'tnc'
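The last traceback is consistent with how Python builds sys.path when handed a script path: the first entry is the directory containing the script (/app/bin here), not the working directory, so tnc is only importable if /app gets onto the path some other way. Below is a minimal sketch that reproduces this in a throwaway directory. The layout and the helper names (make_layout, run_entry_point) are illustrative stand-ins, not my actual code.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional


def make_layout(root: Path) -> Path:
    """Create a stand-in for /app with bin/api and an empty tnc package."""
    app = root / "app"
    (app / "bin").mkdir(parents=True)
    (app / "tnc").mkdir()
    (app / "tnc" / "__init__.py").write_text("")
    # The stand-in entry point imports tnc, then reports sys.path[0].
    (app / "bin" / "api").write_text("import sys\nimport tnc\nprint(sys.path[0])\n")
    return app


def run_entry_point(app_root: Path, pythonpath: Optional[str]) -> subprocess.CompletedProcess:
    """Run `python bin/api` from app_root, with PYTHONPATH unset or set as given."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    return subprocess.run(
        [sys.executable, str(app_root / "bin" / "api")],
        cwd=app_root,
        env=env,
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    app = make_layout(Path(tmp))
    without = run_entry_point(app, None)        # tnc is not on sys.path
    with_root = run_entry_point(app, str(app))  # the app root is on sys.path

print("without PYTHONPATH, exit code:", without.returncode)
print("with PYTHONPATH=<app root>, exit code:", with_root.returncode)
```

Even though the working directory is the app root in both runs, only the second one can import tnc, because the script's own directory (bin) is what lands in sys.path[0].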
The project structure
├── bin
│ └── api
├── config.py
├── pyproject.toml
└── tnc
├── __init__.py
├── data
│ ├── __init__.py
│ ├── download.py
│ ├── field_types.py
│ └── storage_providers
├── errors.py
├── inspect
│ ├── __init__.py
│ └── etl_time_index.py
├── test
│ ├── __init__.py
│ └── test_end-to-end.py
├── utils.py
└── www
├── __init__.py
└── routes
├── __init__.py
├── feedback.py
├── livez.py
└── utils.py
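Worth noting in the tree above: tnc contains a package named inspect. My understanding of the AttributeError is that, with /app/tnc on the PYTHONPATH, that directory is searched ahead of the standard library, so the import inspect done by dataclasses picks up tnc/inspect instead of the real module. Here is a minimal sketch of that lookup order, assuming a standard CPython install where the standard library ships as .py files. The resolve helper and the temporary layout are hypothetical, for illustration only.

```python
import importlib.machinery
import sysconfig
import tempfile
from pathlib import Path


def resolve(name, search_path):
    """Return the file a top-level `import name` would load from search_path."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return spec.origin if spec is not None else None


# Directory holding the real standard-library modules (inspect.py among them).
stdlib_dir = sysconfig.get_paths()["stdlib"]

with tempfile.TemporaryDirectory() as tmp:
    app = Path(tmp) / "app"
    (app / "tnc" / "inspect").mkdir(parents=True)
    (app / "tnc" / "__init__.py").write_text("")
    (app / "tnc" / "inspect" / "__init__.py").write_text("")

    # /app/tnc searched first: the project's inspect package wins.
    shadowed = resolve("inspect", [str(app / "tnc"), stdlib_dir])
    # /app searched first: only tnc is exposed at top level, inspect is stdlib again.
    restored = resolve("inspect", [str(app), stdlib_dir])
    package = resolve("tnc", [str(app), stdlib_dir])

print("inspect with <app>/tnc first:", shadowed)
print("inspect with <app> first:    ", restored)
print("tnc with <app> first:        ", package)
```

With the app root on the path rather than the package directory, the project's module is reachable only as tnc.inspect, so it can no longer collide with the standard library name.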
pyproject.toml
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
[tool.setuptools.packages.find]
where = ["./"]
exclude = [ "res", "notes" ]
dependencies = [ ... with version specs ]
First, I have to give a shout-out to the pyproject.toml + setuptools team: the documentation and implementation have gotten good. It allowed me to get a lot more specific and "deterministic" :)) about my setup, not to mention a bit more aggressive in the build process.
Fixing the "not found" errors
The fix included the following:
- Updated pyproject.toml, and included an __init__.py to mark each subpackage.
- Moved the config.py file into the bin directory. This location captured my design intent. Changes to the api.py file...
- Set the PYTHONPATH env value to "/app", the location of the tnc and bin directories. By no means a best practice, but given my determination to keep bin separate from tnc, it was the only approach that made sense for this use case.
Improved build process
Finally, while there are a few well-known techniques to maximize reuse of the cache when building the Docker image, I wanted to call out how easy it was to know precisely what was going on during the build, made possible by the latest setuptools configured with pyproject.toml.
A. It was trivial to first run the build using an empty stub for where the app code would eventually go.
... paired with the two-phase build (the image is an official Docker Python image)
B. It was clear what to copy from the now-consolidated build artifacts into the image I use for distribution.
In the kube deployment, despite being able to call the entry point configured using pyproject.toml, I chose to call api.py as a script.
Conclusion
I have an improved design that no longer includes "ad-hoc" calls to sys.path, nor resorts to "polluting" the PYTHONPATH. The single entry I now have, /app, conveys an important design choice: wanting to have the entry point be in a separate root directory.