We have a large list of short sentences, each followed by some numbers (for example, a house address followed by its price and square footage), and we want to store them in a vector DB for RAG retrieval later. The elements in the list are ordered by some criterion, and when querying the DB we want that order to influence the similarity scores. For example, if we query the DB by street address and several houses in the list are on that street, we want the returned results to be the ones located closest to the query string in the original list.
We have a loop that converts each sentence plus its numbers to an embedding (OpenAI ada model), but we want to add a measure of relative order to these vectors. Should we just compute a sine/cosine positional encoding from each element's index and append it to the vector?
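To make the idea concrete, here is a minimal sketch of what I mean by appending a sine/cosine positional encoding. The function names, the `dim` and `weight` parameters, and the placeholder vectors are all made up for illustration; the real vectors would come from the ada embedding call, which is left out here.

```python
import math

def positional_encoding(index, dim=16, base=10000.0):
    """Sine/cosine encoding of an integer list position into `dim` values."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    enc = []
    for i in range(dim // 2):
        # Each sin/cos pair uses a progressively lower frequency.
        freq = 1.0 / (base ** (2 * i / dim))
        enc.append(math.sin(index * freq))
        enc.append(math.cos(index * freq))
    return enc

def append_position(embedding, index, dim=16, weight=1.0):
    """Concatenate a scaled positional encoding onto a text embedding."""
    scaled = [weight * v for v in positional_encoding(index, dim)]
    return list(embedding) + scaled

# Placeholder vectors standing in for real ada embeddings (hypothetical data).
fake_embeddings = [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3], [0.1, 0.2, 0.3]]

# Same loop as before, but each stored vector now also carries its list index.
stored = [append_position(e, i) for i, e in enumerate(fake_embeddings)]
```

The `weight` factor is my guess at how one might control how much the position part affects the similarity score relative to the text part; I am not sure what a sensible value would be.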
Is there a better way?
Thank you