Langchain StuffDocumentsChain is not stopping


I'm using StuffDocumentsChain in my LLM Q&A app; the model is Mistral 7B v0.2 Instruct. I'm building the chain with load_qa_chain from langchain.chains.question_answering.

The chain starts generating a correct response, but it stops way too late: after finishing the valid response, it keeps going and produces a lot of garbage.

from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains.question_answering import load_qa_chain
from langchain.prompts import PromptTemplate
from langchain_community.llms import LlamaCpp

docs = [.....]
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
llm = LlamaCpp(
        model_path=model_path,
        n_gpu_layers=-1,
        n_batch=n_batch,
        callback_manager=callback_manager,
        temperature=0.0,
        n_ctx=8192,
        top_p=0.001,
        f16_kv=True,
        verbose=True,
        n_threads=10,
        top_k=2,
        repeat_penalty=1.07,
        use_mlock=True,
        max_tokens=4096,
        stop=['</s>', '[INST]', '[/INST]']
    )
template = """<s>[INST]{context}\n{question}\n[/INST]"""
prompt = PromptTemplate(
            template=template,
            input_variables=["context", "question"]
        )
llm_chain = load_qa_chain(llm=llm, prompt=prompt)
llm_answer = llm_chain({"input_documents": docs, "question": question,
                        "context": docs}, return_only_outputs=True)['output_text']

Is there anything I'm missing or doing wrong? How can I make the chain stop at the correct place?
