mlflow.pyfunc.log_model exclude unused Python packages

I wrap a model with mlflow.pyfunc in my_app like this:

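import mlflow
from setfit import SetFitModel


class SetfitModelWrapper(mlflow.pyfunc.PythonModel):
    """Wraps a SetFitModel so it can be logged as an MLflow pyfunc model."""

    def __init__(self, model: SetFitModel):
        self.model = model

    def predict(self, context, model_input):
        pass  # prediction logic omitted in the question


# `model` is an already trained SetFitModel instance
wrapped_model = SetfitModelWrapper(model=model)
mlflow.pyfunc.log_model(artifact_path='./artifact', python_model=wrapped_model)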

Then, when I load the model in a Flask container on the prod server with app.model = mlflow.pyfunc.load_model(model_parameters.name), I get:

ModuleNotFoundError: No module named 'my_app'

2024-01-18 12:12:07.001 | ERROR    | app.main:<module>:50 - Traceback (most recent call last):
  File "/app/app/main.py", line 42, in <module>
    app.model = mlflow.pyfunc.load_model(model_parameters.name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/venv/lib/python3.11/site-packages/mlflow/pyfunc/__init__.py", line 630, in load_model
    model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/venv/lib/python3.11/site-packages/mlflow/pyfunc/model.py", line 327, in _load_pyfunc
    python_model = cloudpickle.load(f)
                   ^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'my_app'

Usage: python -m flask run [OPTIONS]
Try 'python -m flask run --help' for help.

Error: While importing 'app.main', an ImportError was raised:

Traceback (most recent call last):

It looks like mlflow.pyfunc.log_model, through cloudpickle, automatically pulled in every package it could find in the Python environment. Yes, I can add my_app to the prod requirements, but in that case I see a lot of unused modules being installed. How can I manually specify only the modules the model needs in order to work, or exclude the unnecessary ones?
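For context, the error is consistent with how pickling handles classes defined in an importable module: the class is stored by reference (module name plus class name), not by value, so the module has to be importable wherever the pickle is loaded. cloudpickle follows the same rule for such classes. The sketch below reproduces that mechanism with the standard-library pickle only. The module name my_app_demo and its Wrapper class are hypothetical stand-ins, not the real my_app, and no MLflow call is involved.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the asker's package; not the real my_app.
MODULE_NAME = "my_app_demo"
MODULE_SOURCE = (
    "class Wrapper:\n"
    "    def __init__(self, value):\n"
    "        self.value = value\n"
)


def pickle_then_forget_module() -> bytes:
    """Pickle an instance of a module-level class, then make the module unimportable."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module(MODULE_NAME)
            payload = pickle.dumps(module.Wrapper(42))
        finally:
            # Simulate the production container, where the package is absent.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return payload


def try_load(payload: bytes) -> str:
    """Return 'ok' if the payload loads, else a description of the import failure."""
    try:
        pickle.loads(payload)
    except ModuleNotFoundError as exc:
        return f"{type(exc).__name__}: {exc}"
    return "ok"


payload = pickle_then_forget_module()
result = try_load(payload)
print(result)
```

The payload contains only the name my_app_demo, not the class source, which is why loading fails once the module is gone. If the same thing is happening with my_app, the missing piece is the wrapper's own module rather than the whole environment. Two commonly used options, both worth checking against the documentation for your MLflow and cloudpickle versions: pass the package directory to log_model through its code_path argument (named code_paths in newer MLflow releases), or call cloudpickle.register_pickle_by_value on the module before logging so the class is embedded in the pickle.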
