I am making some rather big Bayesian networks for generating synthetic data, and I find pomegranate to be a good fit, as it generates data quickly and easily allows for inputting evidence. I have one problem with it: saving the trained models. Pomegranate's built-in method stores models as JSON files so big that I run out of memory once I have 30 or so variables, even when using "lighter" algorithms. The models cannot be pickled due to the error
```
TypeError: self.distributions_ptr,self.parent_count,self.parent_idxs cannot be converted to a Python object for pickling
```
I am wondering if anyone has a good alternative for storing pomegranate models, or knows of a Bayesian network library that generates data quickly after training. I would be grateful for any tips.
If your model can be learned and stored in memory, it can be saved to a file, though maybe not by pickling. There are many different file formats for Bayesian networks (bif, xmlbif, dsl, uai, etc.). I don't know pomegranate, but there is certainly a way to read/save using such a format.

With pyAgrum (of which I am one of the authors), you just have to write `gum.saveBN(model, "model.xxx")` to save it, and then `bn = gum.loadBN("model.xxx")` to read it. You can choose `xxx` among all the supported formats, for now: bif | dsl | net | bifxml | o3prm | uai (https://pyagrum.readthedocs.io/en/1.3.1/functions.html#pyAgrum.loadBN).

As far as I understand, evidence for sampling is just a way to filter the samples, keeping only those that respect the constraints (rejection sampling). There is no such direct method in pyAgrum, but it can be done as a post-process:
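A minimal sketch of such a post-process, assuming pyAgrum's `BNDatabaseGenerator` for the sampling and pandas for the filtering (the `sample_with_evidence` helper and its signature are mine, not part of the pyAgrum API):

```python
import pyAgrum as gum
import pandas as pd

def sample_with_evidence(bn, n, evidence=None):
    """Draw n samples from bn, then keep only those matching the evidence."""
    generator = gum.BNDatabaseGenerator(bn)
    generator.drawSamples(n)

    # Round-trip through CSV to get the samples into a pandas DataFrame.
    generator.toCSV("samples.csv")
    df = pd.read_csv("samples.csv")

    # Rejection sampling as a post-process: drop every sample that
    # does not respect the evidence.
    if evidence:
        for var, value in evidence.items():
            df = df[df[var] == value]
    return df
```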
And in a notebook:
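For example (a hypothetical session; `gum.fastBN` builds a small toy network here, and the evidence values depend on your variables' labels):

```python
bn = gum.fastBN("A->B->C;A->D")          # small toy network for the demo
gum.saveBN(bn, "model.bif")              # save in one of the supported formats
bn2 = gum.loadBN("model.bif")            # ... and load it back later

# 10000 samples, keeping only those where B == 1 (rejection sampling)
df = sample_with_evidence(bn2, 10000, evidence={"B": 1})
df.head()
```

Note that with rejection sampling you pay for unlikely evidence: the more improbable the constraints, the more raw samples you need to draw to keep a given number of them.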