I have this LangChain code for my own dataset:
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
vectorstore = FAISS.from_texts(
    docs, embedding=OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
)
retriever = vectorstore.as_retriever()
and I want to add semantic chunking to the dataset (docs) before (or, if possible, after) saving it to the vector store. Specifically, I have been trying to add the following snippet before the code above:
from langchain_experimental.text_splitter import SemanticChunker
text_splitter = SemanticChunker(OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY))
docs = text_splitter.create_documents(docs)
to convert docs into chunked documents, but it doesn't work, possibly because the resulting structure is different.
Has anyone tried this and succeeded?
Try text_splitter.create_documents([docs]), which expects a list of strings (wrap the input in a list if docs is a single string). Reference: https://python.langchain.com/docs/modules/data_connection/document_transformers/semantic-chunker
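
For reference, here is one way the whole pipeline can fit together. This is a minimal sketch, assuming docs is a list of raw strings and OPENAI_API_KEY is already defined (both as in the question). The key point is that create_documents returns Document objects, so the vector store should then be built with FAISS.from_documents rather than FAISS.from_texts:

from langchain_community.vectorstores import FAISS
from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

# Semantic chunking: create_documents takes a list of strings
# and returns a list of Document objects.
text_splitter = SemanticChunker(embeddings)
chunked_docs = text_splitter.create_documents(docs)

# from_texts expects a list of strings, so it would fail on the
# Document objects above; from_documents accepts them directly.
vectorstore = FAISS.from_documents(chunked_docs, embedding=embeddings)
retriever = vectorstore.as_retriever()

If you instead want to keep FAISS.from_texts, you would have to unwrap the chunks first, e.g. pass [d.page_content for d in chunked_docs], but from_documents also preserves any metadata the splitter attaches, so it is the simpler route.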