Can a BERTopic model correlate topics with a unique id in another column?


I have successfully built a topic_model using the BERTopic library on one feature/column of my original dataframe. The dataframe also contains other features, such as a unique id for each row. Each document is a string that describes that unique id without containing it: it is a description of a specific fault on a DSLAM, and the unique id is the Siebel id assigned to that fault.

Here is a row from my dataframe:

SR Number:            1-142482542353
Priority:             MAJOR
Fault Synopsis:       ΔΙΑΚΟΠΗ ΛΕΙΤΟΥΡΓΙΑΣ/ΥΠΗΡΕΣΙΑΣ DSLAM 10547 ΠΡΟΜ...
Reported By:          Diligent
SR_Num_c:             2
Note:                 [Diligent Info:\nDslam_code: 10547\nEETT: 4900...
Notes:                Diligent Info:\nDslam_code: 10547\nEETT: 49002...
Fault_Synopsis_proc:  διακοπη λειτουργιας/υπηρεσιας dslam 10547 προμ...
Notes_proc:           diligent info:\ndslam_code: 10547\neett: 49002...

The feature fed to the BERTopic model is "Notes_proc". What I am asking is whether, after the model is built, it is possible to correlate the topics derived from this feature with the "SR Number" feature.

I tried to correlate them using classes, but that is not what BERTopic's classes are meant for.

classes = df['SR Number']

topics_per_class = topic_model.topics_per_class(docs, classes=classes)

I do not know, though, whether the same classes method would work if I concatenated the "Notes_proc" string with the "SR Number" string. Note that the number of topics is far smaller than the number of unique ids (that is, SR Numbers).
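For reference, here is a minimal sketch of the kind of mapping I am after. It relies on the fact that BERTopic assigns one topic per input document, in the same order as docs (available as topic_model.topics_ after fitting), so the ids can be paired with the topics by position. The helper name ids_by_topic and the sample values are hypothetical stand-ins, not my real data, and the sketch assumes docs was built from df["Notes_proc"] with no rows dropped or reordered.

```python
from collections import defaultdict


def ids_by_topic(ids, topics):
    """Group row identifiers by the topic assigned to the same row position."""
    if len(ids) != len(topics):
        raise ValueError("ids and topics must have the same length")
    grouped = defaultdict(list)
    for row_id, topic in zip(ids, topics):
        grouped[topic].append(row_id)
    return dict(grouped)


# Hypothetical stand-ins. In the real workflow these would be
# df["SR Number"].tolist() and topic_model.topics_ (assumption: same row order).
sr_numbers = ["1-001", "1-002", "1-003", "1-004"]
topics = [0, 1, 0, -1]  # -1 is the label BERTopic uses for outlier documents

# Many ids per topic, since there are far fewer topics than SR Numbers.
topic_to_ids = ids_by_topic(sr_numbers, topics)

# Exactly one topic per id.
id_to_topic = dict(zip(sr_numbers, topics))
```

With pandas, the same positional alignment would presumably allow df["topic"] = topic_model.topics_, after which the topic sits next to "SR Number" in every row. No concatenation of the id into the document text is needed for this.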
