I have the following code (fragment):

    # 6. index documents to Elasticsearch
    kb = ElasticsearchStore(
        es_connection=es,
        index_name=index_name,
        embedding=emb_func,
        strategy=ElasticsearchStore.ApproxRetrievalStrategy(),
        distance_strategy="DOT_PRODUCT"
    )
    print("create vector store in Elasticsearch")

    try:
        _ = kb.add_texts(
            texts=documents['page_content'].tolist(),
            metadatas=[{'source': source} for source in documents['source']],
            index_name=index_name,
            ids=[str(i) for i in range(len(documents))]  # unique for each doc
        )
    except Exception as e:
        print("Failed to index documents:", e)

This runs inside a container. It works when the container runs as root, but when I run it as a non-root user it fails like this:

    Error adding texts: 88 document(s) failed to index.
    First error reason: [1:8471] failed to parse: The [dot_product] similarity can only be used with unit-length vectors. Preview of invalid vector: [-0.18178311, -0.02720352, 0.04890755, -0.10870888, -0.10545816, ...]
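To make the error concrete: a vector is "unit-length" when its L2 norm is 1. Below is a minimal sketch of how the norm could be checked, and how a vector could be rescaled, using only the standard library. The helper names (`l2_norm`, `normalize`, `is_unit_length`) and the tolerance are my own assumptions, not part of LangChain or Elasticsearch, and the sample values are only the five components shown in the error preview, not a full embedding.

```python
import math


def l2_norm(vec):
    """Return the Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in vec))


def normalize(vec):
    """Return a copy of vec rescaled to unit length."""
    norm = l2_norm(vec)
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def is_unit_length(vec, tol=1e-5):
    """True if the L2 norm is within tol of 1.0 (tol is an arbitrary choice)."""
    return abs(l2_norm(vec) - 1.0) <= tol


# Only the truncated preview from the error message; the full vector is not shown there,
# so this norm says nothing conclusive about the real embedding.
preview = [-0.18178311, -0.02720352, 0.04890755, -0.10870888, -0.10545816]

print("norm of preview:", l2_norm(preview))
print("unit length after normalize:", is_unit_length(normalize(preview)))
```

If `emb_func` is a LangChain embeddings object (an assumption on my part), running `l2_norm(emb_func.embed_query("test"))` once as root and once as the non-root user might show whether the two runs actually produce different vectors.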

Here is my Dockerfile:

    # Dockerfile
    ## 1: Base image
    FROM registry.access.redhat.com/ubi8/ubi-minimal
    USER root

    ## 2. Latest security updates && OS packages
    RUN microdnf install -y python3.11
    RUN microdnf install -y python3.11-pip
    RUN microdnf clean all

    ## 4. Initialize application sources
    WORKDIR /app

    ## 5. Application source
    ## Copy the application source and build artifacts from the builder image to this one
    COPY --chown=1001:0 requirements.txt ingestion.py ./

    # added to make cache writable
    RUN chmod -R g+w /app
    ENV TRANSFORMERS_CACHE = '/app/cache/'

    # 6. Install the dependencies
    RUN pip3 install -U "pip>=19.3.1" && \
        pip3 install --no-cache-dir -r requirements.txt
    USER 1001

    ## Run script uses standard ways to run the application
    CMD python ingestion.py

Any ideas why it fails when running as non-root?
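
Since the only intended difference between the two runs is the user, one thing that could be compared is what the process actually sees for the cache directory. This is a rough diagnostic sketch, not a fix: `cache_dir_report` is a hypothetical helper name, and it only reads the `TRANSFORMERS_CACHE` variable set in the Dockerfile above and checks whether the current user can write there.

```python
import os


def cache_dir_report(env_name="TRANSFORMERS_CACHE"):
    """Report the raw env value and whether it is a directory writable by the current user."""
    value = os.environ.get(env_name)
    report = {
        "env_name": env_name,
        "value": value,  # raw value exactly as the process sees it
        "uid": os.getuid() if hasattr(os, "getuid") else None,
        "exists": False,
        "writable": False,
    }
    if value:
        report["exists"] = os.path.isdir(value)
        report["writable"] = report["exists"] and os.access(value, os.W_OK)
    return report


# repr() makes stray characters (spaces, quotes, '=') in the value visible
for key, val in cache_dir_report().items():
    print(f"{key}: {val!r}")
```

Running this in the container as root and as user 1001 should show whether the variable holds the expected path and whether that path is writable in each case.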
