I am using MongoDBAtlasVectorSearch and I want to search for the most similar documents, so I use the function similarity_search_with_score.
However, it seems like I am not able to add filters in this similarity_search_with_score function.
This is my code:
    vector_search = MongoDBAtlasVectorSearch(
        collection=client[os.getenv("MONGODB_DB")]["files"],
        embedding=embeddings,
        index_name=os.getenv("ATLAS_VECTOR_SEARCH_INDEX_NAME"),
    )
    results = vector_search.similarity_search_with_score(
        query="What are the engagements of the company",
        k=5,
        pre_filter={
            "compound": {
                "filter": [
                    {"equals": {"path": "uploaded_by", "value": chat_owner}},
                    {"in": {"path": "file_name", "values": file_names}},
                ]
            }
        },
    )
This is my index:
    {
      "mappings": {
        "dynamic": true,
        "fields": {
          "embedding": {
            "dimensions": 1536,
            "similarity": "cosine",
            "type": "knnVector"
          },
          "file_name": {
            "normalizer": "lowercase",
            "type": "token"
          },
          "uploaded_by": {
            "normalizer": "lowercase",
            "type": "token"
          }
        }
      }
    }
However, this gives me the following error:
pymongo.errors.OperationFailure: "knnBeta.filter.compound.filter[1].in.value" is required, full error: {'ok': 0.0, 'errmsg': '"knnBeta.filter.compound.filter[1].in.value" is required', 'code': 8, 'codeName': 'UnknownError', '$clusterTime': {'clusterTime': Timestamp(1704804627, 1), 'signature': {'hash': b'\xfa\x15s+Q\x1d\xa86]R\xb2!\x9d\xc5b-G\xce\xa6S', 'keyId': 7283272637088792583}}, 'operationTime': Timestamp(1704804627, 1)}
I also tried it like this:
    pre_filter={
        "$and": [
            {"uploaded_by": {"$eq": chat_owner}},
            {"file_name": {"$in": file_names}},
        ]
    },
But I got this error:
pymongo.errors.OperationFailure: "knnBeta.filter" one of [autocomplete, compound, embeddedDocument, equals, exists, geoShape, geoWithin, in, knnBeta, moreLikeThis, near, phrase, queryString, range, regex, search, span, term, text, wildcard] must be present, full error: {'ok': 0.0, 'errmsg': '"knnBeta.filter" one of [autocomplete, compound, embeddedDocument, equals, exists, geoShape, geoWithin, in, knnBeta, moreLikeThis, near, phrase, queryString, range, regex, search, span, term, text, wildcard] must be present', 'code': 8, 'codeName': 'UnknownError', '$clusterTime': {'clusterTime': Timestamp(1704802325, 9), 'signature': {'hash': b'`\xd27-\x81+\x16\xd0a\x14\xc7\x99\xa8\x05|Sx?\x0e:', 'keyId': 7283272637088792583}}, 'operationTime': Timestamp(1704802325, 9)}
WARNING: StatReload detected changes in 'src/routes/chats/chats.py'. Reloading...
How can I use filters in similarity_search_with_score properly?
Looking at your error message, and based on this answer in the MongoDB Forums, it looks like your "in" clause is using "values" instead of "value". As an example:
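A minimal sketch of the corrected filter, assuming the Atlas Search filter syntax that the first error message points to (the "in" operator expects a "value" key, which may hold a list). The helper name build_pre_filter and the sample owner and file names are hypothetical and used only for illustration; they are not part of LangChain or the original code.

```python
def build_pre_filter(chat_owner, file_names):
    """Build an Atlas Search compound filter.

    Hypothetical helper for illustration only; not a LangChain API.
    """
    return {
        "compound": {
            "filter": [
                {"equals": {"path": "uploaded_by", "value": chat_owner}},
                # The "in" operator takes "value" (here a list), not "values".
                {"in": {"path": "file_name", "value": list(file_names)}},
            ]
        }
    }


# Sample inputs are made up for the sketch.
pre_filter = build_pre_filter("alice", ["report.pdf", "notes.txt"])
print(pre_filter)

# Intended use with the vector store from the question (not executed here):
# results = vector_search.similarity_search_with_score(
#     query="What are the engagements of the company",
#     k=5,
#     pre_filter=pre_filter,
# )
```

The only change from the filter in the question is the key name inside the "in" clause; the rest of the call is left as it was.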