Narrow down pkl dependencies


I am using cloudpickle.load together with sys.modules to narrow down the dependencies that go into a requirements.txt for a pickle file. However, the generated dependencies are not the same across environments. When I run the code in a Conda environment, sys.modules contains modules that are not actually used by the pickle, such as NumPy. Is there any way to make the sys.modules cache behave consistently across environments?

import inspect
import sys
from pathlib import Path

import cloudpickle


def get_currently_used_packages():
    """Map the modules currently in sys.modules back to their installed packages."""
    all_installed_packages = get_all_installed_packages()  # helper: lists installed packages via pip
    # helper: maps each package name to the set of files that package owns
    package_to_file_names = map_package_names_to_files(
        [x["name"] for x in all_installed_packages]
    )

    # File paths of every module already imported in this process.
    currently_used_files = {
        Path(m.__file__)
        for m in sys.modules.values()
        if inspect.ismodule(m) and hasattr(m, "__file__") and m.__file__
    }

    # A package counts as "used" if one of its files matches an imported module's file.
    currently_used_packages = set()
    for file in currently_used_files:
        for package, files in package_to_file_names.items():
            if file in files:
                # logger.info(f"Capturing package -- {package}")
                currently_used_packages.add(package)
    return currently_used_packages, currently_used_files


with open(pkl_path, mode="rb") as file:
    cloudpickle.load(file)

get_currently_used_packages()  # used to generate requirements.txt
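To make the question concrete: what I am after is a way to count only the modules that enter sys.modules because of cloudpickle.load, not the ones the environment has already imported on startup (for example Conda pre-importing NumPy). The snapshot-and-diff below is only a rough sketch of that idea, reusing the pkl_path placeholder from above; it is not a working solution:

import sys

import cloudpickle

# Modules the interpreter/environment has already imported before the pickle
# is touched (in a Conda environment this can already include numpy and friends).
modules_before = set(sys.modules)

with open(pkl_path, mode="rb") as file:  # same pkl_path placeholder as above
    cloudpickle.load(file)

# Only the modules that were imported as a consequence of loading the pickle.
modules_loaded_by_pickle = set(sys.modules) - modules_before
print(sorted(modules_loaded_by_pickle))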

