LangChain: Local file's content is retrieved correctly but the LLM returns strange results


I am trying to make LangChain work locally so that files on my machine are searched for a query and the results are then processed by the LLM to interact with the user. For this I use LlamaCpp and TextLoader from the langchain_community package. The problem I encounter: the retrieval of the content from my files works fine, but somehow the LLM doesn't behave the way it is supposed to.

First, the code. It is fairly simple; I took most of it from the documentation:


from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.llms import LlamaCpp
from langchain_community.embeddings import GPT4AllEmbeddings
from langchain_community.vectorstores import Chroma

def loadText():
    loader = TextLoader("./data/test_2.txt")
    return loader.load()

def useChain2():
    # Prompt
    prompt = PromptTemplate.from_template(
        "Summarize the main themes in these retrieved docs: {docs}"
    )

    # Metal set to 1 is enough.
    n_gpu_layers = 1
    # Should be between 1 and n_ctx, consider the amount of RAM of your Apple Silicon Chip.
    n_batch = 512

    # Make sure the model path is correct for your system!
    llm = LlamaCpp(
        model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",
        # model_path="./models/mixtral-8x7b-v0.1.Q4_K_M.gguf",
        n_gpu_layers=n_gpu_layers,
        n_batch=n_batch,
        n_ctx=2048,
        f16_kv=True,  # MUST set to True, otherwise you will run into problem after a couple of calls
        verbose=True,
    )

    # Chain
    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)


    chain = {"docs": format_docs} | prompt | llm | StrOutputParser()

    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
    all_splits = text_splitter.split_documents(loadText())
    vectorstore = Chroma.from_documents(documents=all_splits, embedding=GPT4AllEmbeddings())

    # Run
    question = "What is Dot Voting?"
    docs = vectorstore.similarity_search(question)
    test = chain.invoke(docs)

    print("Result: " + test)

useChain2()
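
To see what actually reaches the model, I also look at the intermediate prompt on its own. Roughly like this (just a sketch, reusing prompt, format_docs, vectorstore and llm from useChain2 above):

question = "What is Dot Voting?"
docs = vectorstore.similarity_search(question)

# The string that the chain hands to the LLM after formatting the retrieved docs
rendered = prompt.format(docs=format_docs(docs))
print(rendered)

# Call the model directly on that string
print(llm.invoke(rendered))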

Now, my results. The first and third screenshots show the return value of the part of the chain that prepares the input for the LLM, for two different models (Llama and Mistral). The second and fourth screenshots show the corresponding LLM output. The question passed to the LLM was well formulated and based on what was found in the files. However, the LLM does not refer to this question at all, and in the first example it actually answers with only one word: "stakeholders".

Now, I am somewhat clueless about what I am doing wrong. The second model is Mistral's mid-level one, i.e., a fairly big and well-suited model, but it also messes up my prompt.

Where am I going wrong?

[Screenshots 1–4: the prepared chain input and the corresponding LLM output, first for the Llama run, then for the Mistral run]
