LangChain | How to use the metadata attribute when retrieving documents from a vector store after chunking with HTMLHeaderTextSplitter


I created chunks with HTMLHeaderTextSplitter. Each chunked document has a single metadata key whose value differs per chunk, e.g. {"header": "something going on"}. When retrieving documents from the vector store for a query, I also want to search the metadata and return any document whose metadata contains the query word(s).

Right now I am using PGVector, but I can switch to another vector store if that offers a solution.


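from langchain.chains import RetrievalQA
from langchain.vectorstores.pgvector import PGVector

# COLLECTION_NAME, CONNECTION_STRING, embeddings, llm and
# chain_type_kwargs are defined elsewhere in my code.
store = PGVector(
    collection_name=COLLECTION_NAME,
    connection_string=CONNECTION_STRING,
    embedding_function=embeddings,
)
retriever = store.as_retriever()

vector_dbqa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    verbose=True,
    chain_type_kwargs=chain_type_kwargs,
)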

Any help would be much appreciated!

I tried the code shown above.

1 Answer

Answer by kenny:
You can add the text and metadata as follows after creating the PGVector object:
# text_list: list of strings (the chunk texts)
# metadatas: list of dicts, one per text, e.g. [{"header": "something going on"}]
store = PGVector(
    collection_name=COLLECTION_NAME,
    connection_string=CONNECTION_STRING,
    embedding_function=embeddings,
)
# Store each text together with its metadata dict.
store.add_texts(texts=text_list, metadatas=metadatas)
retriever = store.as_retriever()

vector_dbqa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    verbose=True,
    chain_type_kwargs=chain_type_kwargs,
)
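
Storing the metadata does not by itself make the retriever look inside it, so here is a minimal sketch of one way to do the word match from the question in plain Python. It is not part of the PGVector API. `Doc` is a hypothetical stand-in for a LangChain document (anything with `page_content` and `metadata` works), and `split_for_add_texts`, `header_matches` and `merge_results` are illustrative helper names. It assumes the metadata key is "header", as in the question.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_for_add_texts(docs):
    """Build the parallel lists (texts, metadatas) that add_texts takes."""
    texts = [d.page_content for d in docs]
    metadatas = [d.metadata for d in docs]
    return texts, metadatas


def _words(text):
    """Lowercase a string and split it into alphanumeric words."""
    return set(re.findall(r"\w+", str(text).lower()))


def header_matches(docs, query, key="header"):
    """Return docs whose metadata[key] shares at least one word with the query."""
    query_words = _words(query)
    return [d for d in docs if query_words & _words(d.metadata.get(key, ""))]


def merge_results(vector_hits, metadata_hits):
    """Concatenate both result lists, dropping duplicates and keeping order."""
    merged, seen = [], set()
    for d in list(vector_hits) + list(metadata_hits):
        marker = (d.page_content, repr(sorted(d.metadata.items())))
        if marker not in seen:
            seen.add(marker)
            merged.append(d)
    return merged


if __name__ == "__main__":
    docs = [
        Doc("Install steps", {"header": "Getting started"}),
        Doc("Pricing table", {"header": "something going on"}),
    ]
    text_list, metadatas = split_for_add_texts(docs)
    extra = header_matches(docs, "what is going on?")
    print([d.page_content for d in merge_results(docs[:1], extra)])
```

In a real setup, `vector_hits` would be whatever the retriever returns for the query, and `docs` would be the chunks produced by HTMLHeaderTextSplitter. The match is a plain word overlap, so common words such as "is" or "the" will also match; filtering out stop words or requiring several shared words are possible refinements.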