I have built a Doc2Vec model and am trying to get the vectors for my entire test set (176 points). With the code below I can only see one vector at a time. I want to be able to pass "clean_corpus[404:]" to get vectors for the whole test set, but when I try that it still outputs a single vector.
model.save("d2v.model")
print("Model Saved")
from gensim.models.doc2vec import Doc2Vec
model = Doc2Vec.load("d2v.model")
# Infer the vector of a document that is not in the training data
test_data = clean_corpus[404]
v1 = model.infer_vector(test_data)
print("V1_infer", v1)
Is there a way to easily iterate over the model to get and save all 176 vectors?
The simplest way (though not the cheapest) is to iterate through the test set and run each document through the .infer_vector() function. It takes one tokenized document and returns one vector, so the same loop handles multiple sentences.
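A minimal sketch of that loop, assuming every entry of the test set is already a list of tokens. The helper names infer_all and infer_and_save and the output file name test_vectors.npy are made up for illustration; only Doc2Vec.load and infer_vector come from gensim.

```python
import numpy as np


def infer_all(model, docs):
    """Infer one vector per document and stack them into a 2-D array.

    `model` is anything with an `infer_vector(tokens)` method, such as a
    trained gensim Doc2Vec model; each item of `docs` is a list of tokens.
    """
    # infer_vector() handles a single document, so call it once per document.
    vectors = [model.infer_vector(doc) for doc in docs]
    return np.vstack(vectors)  # shape: (len(docs), vector_size)


def infer_and_save(model_path, docs, out_path="test_vectors.npy"):
    """Load a saved Doc2Vec model, infer vectors for `docs`, save them to disk."""
    # Imported here so infer_all() can be used without gensim installed.
    from gensim.models.doc2vec import Doc2Vec

    model = Doc2Vec.load(model_path)
    vectors = infer_all(model, docs)
    np.save(out_path, vectors)  # reload later with np.load(out_path)
    return vectors
```

With the names from the question, infer_and_save("d2v.model", clean_corpus[404:]) should return an array with one row per test document. Inference is randomized, so repeated calls on the same document give slightly different vectors.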
But looking at the code (https://github.com/RaRe-Technologies/gensim/blob/62669aef21ae8047c3105d89f0032df81e73b4fa/gensim/models/doc2vec.py), there's a .dv attribute, short for doc vectors, that you can use to retrieve the vectors learned for the documents the model was trained on.
And if we add two more sentences to the training data and retrain, .dv holds a vector for each of them as well.
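A rough sketch of reading those training vectors, assuming gensim 4.x (in gensim 3.x the attribute was called docvecs). The helper names training_vectors and demo and the toy sentences are invented for illustration. Note that .dv only covers documents seen during training, so unseen test documents still need infer_vector().

```python
def training_vectors(model):
    """Return the array of document vectors learned during training.

    Assumes gensim 4.x, where a Doc2Vec model exposes them as `model.dv`;
    the array has one row per training tag.
    """
    return model.dv.vectors


def demo():
    """Train a toy model twice to show that .dv grows with the training set."""
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    sentences = [["human", "computer", "interface"], ["graph", "of", "trees"]]
    extra = [["user", "response", "time"], ["graph", "minors", "survey"]]

    for docs in (sentences, sentences + extra):
        # Tag each document with its position so every document gets its own vector.
        corpus = [TaggedDocument(words, [i]) for i, words in enumerate(docs)]
        model = Doc2Vec(corpus, vector_size=8, min_count=1, epochs=10)
        print(training_vectors(model).shape)  # one row per tagged document
```

Calling demo() requires gensim and prints the shape of the vector array for the two-sentence corpus and then for the four-sentence one. A single training vector can be looked up by its tag, for example model.dv[0].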