I want to apply the Bayes factor method from Bybee, Leland, Bryan T. Kelly, Asaf Manela, and Dacheng Xiu, "Business News and Business Cycles", forthcoming in the Journal of Finance. To do that, I need to calculate the posterior probability of an LDA model for a given number of topics, so that I can compare models with different numbers of topics. I tried to use LdaState in gensim, but I could not work out the right parameters. Can anyone tell me how to use LdaState?
For example:
eta = lda0.eta
lamda = LdaState(eta, shape=((i, 10),)).get_lambda()
File "C:\Users\AppData\Roaming\JetBrains\PyCharmCE2023.2\scratches\scratch_10.py", line 71, in run
lamda = LdaState(eta, shape=((i, 10),)).get_lambda()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\User\PycharmProjects\pythonProject\venv\Lib\site-packages\gensim\models\ldamodel.py", line 174, in __init__
self.sstats = np.zeros(shape, dtype=dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'tuple' object cannot be interpreted as an integer
I appreciate jejemalani's help, but I still need some detailed examples. Please see the comments.
Passing the shape as a flat tuple, shape=(i, 10), instead of the nested ((i, 10),), should fix your problem, given i is an integer. Taken from the Gensim LDA Model Docs:
Bases: SaveLoad
Parameters
eta (numpy.ndarray) – The prior probabilities assigned to each term.
shape (tuple of (int, int)) – Shape of the sufficient statistics: (number of topics to be found, number of terms in the vocabulary).
dtype (type) – Overrides the numpy array default types.

dtype is optional. Make sure the rest of the parameters you pass match the above.
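To see why the nested tuple fails, here is a minimal sketch that uses only NumPy and mirrors the np.zeros(shape, dtype=dtype) line shown in the traceback. The values of num_topics and num_terms and the stand-in eta array are hypothetical. The last step assumes that get_lambda() returns the prior plus the sufficient statistics, so treat it as an illustration rather than gensim's actual code.

```python
import numpy as np

num_topics = 3   # plays the role of `i` in the question (hypothetical value)
num_terms = 10   # vocabulary size (hypothetical value)

# Stand-in for lda0.eta: one prior weight per vocabulary term.
eta = np.full(num_terms, 1.0 / num_topics, dtype=np.float32)

# The nested tuple from the question reproduces the TypeError:
# np.zeros finds a tuple where it expects an integer dimension.
try:
    np.zeros(((num_topics, num_terms),), dtype=np.float32)
    nested_shape_failed = False
except TypeError:
    nested_shape_failed = True

# A flat (num_topics, num_terms) tuple matches the documented shape.
sstats = np.zeros((num_topics, num_terms), dtype=np.float32)

# Assumption: get_lambda() adds the prior to the sufficient statistics.
lam = eta + sstats
print(nested_shape_failed, lam.shape)
```

With gensim itself, the corresponding call would presumably be LdaState(eta, shape=(i, len(eta))).get_lambda(), where the second entry is the vocabulary size so that eta broadcasts against the statistics. A freshly constructed LdaState starts with all-zero sufficient statistics, so its lambda reflects only the prior. The statistics learned during training should be available on the fitted model's own state (lda0.state).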