Create embeddings for string matching

203 Views Asked by user3585510 At 16 October 2023 at 22:36

I have four lists of company names. Take the company Google as an example: in list A it is written as Google Ltd; in the second list, as Google Inc (an extended form); in the third, as Beta Gogl (misspelled); and in the fourth, as ABC Googl. I want to create embeddings (a vector index / vector store) for all the names in the four lists.

When a new company name comes in, I would generate an embedding for it and find the closest match in the store.

One approach is to skip embeddings and use an edit distance (Levenshtein etc.) to find the most similar name. The issue is that with thousands of names in each list, every lookup has to compare the new name against all of them, which costs a lot of computation each time I want to find a similar one (let's say string matching is done 1,000 times a day).
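
A minimal sketch of the edit-distance baseline described above, in plain Python with no third-party libraries. The function names `levenshtein` and `closest_by_edit_distance` are illustrative, and lowercasing the names before comparison is an assumption, not something stated in the question. Every lookup scans all candidates, which is the cost being discussed.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        previous = current
    return previous[-1]


def closest_by_edit_distance(query: str, candidates: list[str]) -> str:
    """Brute-force search: compares the query against every candidate."""
    q = query.lower()  # assumption: matching should ignore case
    return min(candidates, key=lambda name: levenshtein(q, name.lower()))


names = ["Google Ltd", "Microsoft Inc", "Apple Ltd"]  # made-up sample data
match = closest_by_edit_distance("Gogle", names)
```

Each call to `levenshtein` takes time proportional to the product of the two string lengths, and `closest_by_edit_distance` repeats that once per stored name.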

So I want to build an embedding vector store so that I can find the most similar name quickly.
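
One way this idea could be sketched without a trained model is to embed each name as a normalized vector of character trigram counts and compare vectors by cosine similarity. This is a swapped-in technique chosen for illustration, not GloVe and not a learned embedding; the names `char_ngrams`, `embed`, `cosine` and `NameIndex` are hypothetical. The vectors for the stored names are computed once, so only the incoming name has to be embedded at query time. The search below is still a linear scan over the stored vectors.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> list[str]:
    """Character n-grams of the lowercased name, padded with spaces."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


def embed(text: str, n: int = 3) -> dict[str, float]:
    """Sparse unit-length vector of n-gram counts."""
    counts = Counter(char_ngrams(text, n))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Dot product of two unit vectors, i.e. their cosine similarity."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the smaller vector
    return sum(w * v.get(gram, 0.0) for gram, w in u.items())


class NameIndex:
    """Stores one precomputed vector per name (a tiny in-memory vector store)."""

    def __init__(self, names: list[str]):
        self.names = list(names)
        self.vectors = [embed(name) for name in self.names]

    def closest(self, query: str) -> tuple[str, float]:
        q = embed(query)
        scores = [cosine(q, vec) for vec in self.vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.names[best], scores[best]


# Names taken from the example in the question, plus one unrelated name.
index = NameIndex(["Google Ltd", "Google Inc", "Beta Gogl", "ABC Googl", "Microsoft Corp"])
name, score = index.closest("Googl Inc")
```

Because shared trigrams survive small misspellings and extra words, variants such as "Googl" and "Google" end up with overlapping vectors. The same vectors could be handed to a nearest-neighbour library if the lists grow large enough for the scan to matter.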

GloVe could be an option, but I am not sure whether it works on names alone (it works well on sentences).

Recommendations for any other approach would also be welcome.
