I need to add these LSA topic keywords to the corresponding topic in my data frame. How can I get the output of this print statement into a data frame?
--> I am trying to get a data frame with the topic numbers in one column and their corresponding keywords in another.
# most important words for each topic
vocab = vect.get_feature_names()
for i, comp in enumerate(lsa_model.components_):
    vocab_comp = zip(vocab, comp)
    sorted_words = sorted(vocab_comp, key=lambda x: x[1], reverse=True)[:3]
    print("Topic " + str(i) + ": ")
    for t in sorted_words:
        print(t[0], end=" ")
    print("\n")
Topic 1: xxx yyy zzz
. . .
Topic 8: fddd dddd dsdsd
Topic 9: akah ahkha ahkha
Assuming you have a data frame named df where the LSA topics are stored as integers under the column df['topics'], you could do the following:
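One way this could look, as a minimal sketch rather than a definitive solution: collect the top words per topic into a dict instead of printing them, then map that dict onto the topic column. The helper name top_words_per_topic, the topics_df frame, the keywords column and the small stand-in vocab and components arrays are all assumptions made for illustration; in your case vocab and components would come from vect.get_feature_names() and lsa_model.components_.

```python
import numpy as np
import pandas as pd


def top_words_per_topic(vocab, components, n_words=3):
    """Return {topic number: "word1 word2 word3"} from LSA component weights."""
    keywords = {}
    for i, comp in enumerate(components):
        # Indices of the n_words largest weights, highest weight first.
        top_idx = np.argsort(comp)[::-1][:n_words]
        keywords[i] = " ".join(vocab[j] for j in top_idx)
    return keywords


# Stand-in data (hypothetical). In the question these would be
# vect.get_feature_names() and lsa_model.components_.
vocab = ["apple", "banana", "cherry", "date"]
components = np.array([
    [0.9, 0.1, 0.5, 0.3],
    [0.2, 0.8, 0.1, 0.7],
])

keywords = top_words_per_topic(vocab, components)

# One row per topic: topic number and its keywords in separate columns.
topics_df = pd.DataFrame({
    "topics": list(keywords.keys()),
    "keywords": list(keywords.values()),
})

# Hypothetical document-level frame with an integer topic per row.
df = pd.DataFrame({"text": ["doc a", "doc b", "doc c"], "topics": [0, 1, 0]})

# Attach the keywords to each row by looking up its topic number.
df["keywords"] = df["topics"].map(keywords)
```

Here topics_df holds one row per topic, while df gets a keywords column aligned with its existing topics column. Note that the dict keys start at 0, matching enumerate in the loop from the question, so the integers in df['topics'] need to use the same numbering.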