Issue replicating a weighted dataset with a weighted histogram (axes.hist / np.histogram)


I'm trying to emulate a weighted data set, and when I plot both data sets with matplotlib's hist (which calls np.histogram internally), passing the weights as an input, I get a massive discrepancy between the original and the emulated histograms. Here's the plot: histograms with weights

Here, the solid blue bins are considered the ground truth, and the orange bins are my attempt at replicating the ground truth. The lines of code used to produce this are:

axes.hist(massBound[subhalos], weights=w, bins=70, range=(1e6, 1e10), label='Galacticus', fill=True, edgecolor='blue')
axes.hist(em_massBound, weights=em_weights, bins=70, range=(1e6, 1e10), label='Emulator', fill=False, edgecolor='orange')

However, when I look at the components that go into .hist individually, everything seems to match. Here's what I get when I run the same two lines with the weights argument removed from both axes.hist calls:

histograms without weights

The distributions of the weights w and em_weights themselves also look identical:

weights themselves
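
One thing that may be worth ruling out: a weighted histogram depends on which weight is attached to which value, not only on the two separate distributions. The sketch below uses purely synthetic arrays (mass, w_paired and w_shuffled are hypothetical stand-ins, not the Galacticus or emulator data) to show that two data sets can share identical unweighted histograms and identical weight distributions and still give very different weighted histograms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (hypothetical; not the Galacticus or emulator arrays).
mass = np.sort(rng.uniform(1e6, 1e10, size=10_000))
weights = rng.lognormal(mean=0.0, sigma=1.0, size=mass.size)

# Case A: weights correlated with mass (largest weights on largest masses).
w_paired = np.sort(weights)
# Case B: exactly the same weight values, but randomly re-paired with the masses.
w_shuffled = rng.permutation(w_paired)

n_bins, value_range = 70, (1e6, 1e10)
h_paired, edges = np.histogram(mass, bins=n_bins, range=value_range, weights=w_paired)
h_shuffled, _ = np.histogram(mass, bins=n_bins, range=value_range, weights=w_shuffled)

# The weight distributions are identical by construction...
same_weight_values = np.array_equal(np.sort(w_paired), np.sort(w_shuffled))
# ...and the masses are literally the same array, yet the weighted bins differ.
max_rel_diff = np.max(np.abs(h_paired - h_shuffled) / h_shuffled)

print("same weight values:", same_weight_values)
print("largest relative difference between weighted bins:", max_rel_diff)
```

If something like this is happening here, the mismatch would come from the emulator not reproducing the correlation between mass and weight, rather than from anything inside the histogram routine. That is only a guess from the plots described above.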

Does anybody know if there's something within np.histogram (or matplotlib's hist) that could cause this discrepancy? Thanks!

I tried looking at each input to the histogram call individually to see if there was a discrepancy, and there is none. I expected the weighted histograms to look the same, given that the unweighted histograms and the weight distributions look the same.
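
A check that looks at the inputs jointly rather than one at a time might narrow this down. The helper below is a minimal sketch (mean_weight_per_bin is a hypothetical name, not a NumPy or matplotlib function): it divides the weighted bin sums by the unweighted counts to get the mean weight in each bin, using the same bins and range as the plotting calls above.

```python
import numpy as np

def mean_weight_per_bin(values, weights, bins=70, value_range=(1e6, 1e10)):
    """Return (bin_edges, mean weight of the samples falling in each bin)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if values.shape != weights.shape:
        raise ValueError("values and weights must have the same shape")
    # Unweighted counts and weighted sums over identical bins.
    counts, edges = np.histogram(values, bins=bins, range=value_range)
    wsum, _ = np.histogram(values, bins=bins, range=value_range, weights=weights)
    # Empty bins are reported as NaN instead of dividing by zero.
    mean_w = np.full(counts.shape, np.nan)
    np.divide(wsum, counts, out=mean_w, where=counts > 0)
    return edges, mean_w

# Tiny hand-checkable example: two samples in the first bin, one in the second.
edges, mean_w = mean_weight_per_bin(
    [1.0, 1.5, 3.5], [2.0, 4.0, 10.0], bins=2, value_range=(0.0, 4.0)
)
print(mean_w)  # [ 3. 10.]
```

Applied to the arrays in the question, this would be mean_weight_per_bin(massBound[subhalos], w) compared against mean_weight_per_bin(em_massBound, em_weights). Bins where the two curves diverge would show where the emulated weights are attached to different masses than in the ground truth.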
