Trying to understand Bayes theory with direct application

36 Views Asked by elksie5000 At 02 February 2024 at 20:22

I'm trying to apply Bayes' theorem to a problem of my own so I can understand the methodology and how to set up the numbers.

Essentially, I've got data on four artists (h, j, n and t) and the number of objects each of them makes per month, over roughly 80 monthly periods.

I'm interested in artist 'h' and believe there are three equally likely explanations for their most recent figures: 1) they have left work, 2) they have been promoted and now split their time between making and managing others, 3) they have been working on a project.

I've used Allen Downey's Think Bayes code to work through the process.

from empiricaldist import Pmf

# Define hypotheses
hypos = ["left", "managing", "project"]

# Define prior probabilities - they are all equally likely
prior = Pmf(1/len(hypos), hypos)

# Display prior probabilities
print("Prior probabilities:")
print(prior)

The result:

Prior probabilities:
left        0.333333
managing    0.333333
project     0.333333
dtype: float64

Code:

# Normalize the data to calculate the likelihoods
normalized_data = df.div(df.sum(axis=1), axis=0)

Normalized Data:

          h         j         n    t
0  0.000000  1.000000  0.000000  0.0
1  0.666667  0.333333  0.000000  0.0
2  0.571429  0.428571  0.000000  0.0
3  0.769231  0.230769  0.000000  0.0
4  0.700000  0.300000  0.000000  0.0
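
To make the normalization concrete, here is the same arithmetic in plain Python for the first five rows of the data listed at the end of this question. Each row is divided by its own total, so every row sums to 1 and each value is that artist's share of the month's output. Whether such a share can stand in for a likelihood is an assumption I have not verified: it is not obviously the probability of the data under "left", "managing" or "project".

```python
# First five monthly counts per artist, copied from the data in this question.
counts = [
    {"h": 0.0, "j": 2.0, "n": 0.0, "t": 0.0},
    {"h": 2.0, "j": 1.0, "n": 0.0, "t": 0.0},
    {"h": 4.0, "j": 3.0, "n": 0.0, "t": 0.0},
    {"h": 10.0, "j": 3.0, "n": 0.0, "t": 0.0},
    {"h": 7.0, "j": 3.0, "n": 0.0, "t": 0.0},
]

# Divide every value by its row total, which is what
# df.div(df.sum(axis=1), axis=0) does row by row.
normalized = []
for row in counts:
    total = sum(row.values())
    normalized.append({artist: value / total for artist, value in row.items()})

for i, row in enumerate(normalized):
    print(i, {artist: round(share, 6) for artist, share in row.items()})
```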

Now I get confused.

from empiricaldist import Pmf

# Define hypotheses
hypos = ["left", "managing", "project"]

# Calculate the likelihoods using the normalized data
likelihoods = {}
for hypo in hypos:
    likelihoods[hypo] = normalized_data.apply(lambda row: row if hypo == "left" else 1, axis=1)

# Perform Bayesian update to obtain the posterior probabilities
posterior_history = [prior]
for hypo in hypos:
    posterior = prior.copy()  # Create a copy of the prior probabilities
    if hypo == "left":
        # Ensure alignment of row labels and perform element-wise multiplication
        for index, row in likelihoods[hypo].iterrows():
            if index in posterior.index:
                posterior.loc[index] *= row
    posterior /= posterior.sum()  # Normalize the posterior probabilities
    posterior_history.append(posterior)

The output is:

[left        0.333333
 managing    0.333333
 project     0.333333
 dtype: float64,
 left        0.333333
 managing    0.333333
 project     0.333333
 dtype: float64,
 left        0.333333
 managing    0.333333
 project     0.333333
 dtype: float64,
 left        0.333333
 managing    0.333333
 project     0.333333
 dtype: float64]

I was confused by the output for two reasons: 1) the posterior is the same as the prior, and 2) there are four outputs.
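
One possible reading of the loop above, sketched with a plain dict standing in for the Pmf (a substitution made only to keep the example free of dependencies): the history starts with the prior and gains one entry per hypothesis, and the inner test compares integer row labels with hypothesis names, so the multiplication may never run. The `row_labels` range and the 0.5 factor are placeholders, not values from my real code.

```python
import math

hypos = ["left", "managing", "project"]
prior = {hypo: 1 / len(hypos) for hypo in hypos}

# Placeholder for the integer row labels 0..82 of normalized_data.
row_labels = range(83)

history = [prior]
for hypo in hypos:                        # one pass per hypothesis
    posterior = dict(prior)               # every pass restarts from the prior
    if hypo == "left":
        for index in row_labels:
            if index in posterior:        # a row number is never a hypothesis name
                posterior[index] *= 0.5   # placeholder factor, never reached
    total = sum(posterior.values())
    posterior = {h: p / total for h, p in posterior.items()}
    history.append(posterior)

# True when every stored distribution still matches the prior.
unchanged = all(
    math.isclose(dist[h], prior[h]) for dist in history for h in hypos
)

print(len(history))   # the prior plus one entry per hypothesis
print(unchanged)
```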

Maybe I'm over-complicating this and should simply update the probabilities using a single column of the normalized data.

What can I try next?
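
For comparison, this is a minimal sketch of what I understand a sequential Bayesian update to look like: start from the prior, multiply by the likelihood of each observation under every hypothesis, and normalize. It assumes the monthly counts follow a Poisson distribution, and the rates in `assumed_rates` are made up for illustration, not estimated from my data. The observations are the last three monthly counts for 'h' (rows 80, 81 and 82).

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of seeing k objects in a month at an average rate lam."""
    return lam ** k * exp(-lam) / factorial(k)

# ASSUMPTION: illustrative monthly rates for 'h' under each hypothesis.
# These values are invented for the sketch, not derived from the data.
assumed_rates = {"left": 0.1, "managing": 1.5, "project": 3.0}

# Equal prior over the three hypotheses, as in the question.
prior = {hypo: 1 / len(assumed_rates) for hypo in assumed_rates}

# Last three monthly counts for 'h' (rows 80, 81 and 82 of the data).
observed = [1, 2, 0]

def update(dist, count, rates):
    """One Bayesian update: prior times likelihood, then normalize."""
    unnormalized = {h: p * poisson_pmf(count, rates[h]) for h, p in dist.items()}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

posterior = dict(prior)
history = [posterior]
for count in observed:
    # Each step uses the previous posterior as the new prior.
    posterior = update(posterior, count, assumed_rates)
    history.append(posterior)

for step, dist in enumerate(history):
    print(step, {h: round(p, 3) for h, p in dist.items()})
```

Here the history holds the prior plus one entry per observation rather than one per hypothesis, and the result depends entirely on the assumed rates.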

I've created a dict of the data, data_to_dict, thus:

{'h': {0: 0.0,
  1: 2.0,
  2: 4.0,
  3: 10.0,
  4: 7.0,
  5: 6.0,
  6: 4.0,
  7: 10.0,
  8: 11.0,
  9: 3.0,
  10: 4.0,
  11: 6.0,
  12: 3.0,
  13: 4.0,
  14: 8.0,
  15: 9.0,
  16: 6.0,
  17: 5.0,
  18: 6.0,
  19: 5.0,
  20: 4.0,
  21: 1.0,
  22: 3.0,
  23: 4.0,
  24: 0.0,
  25: 2.0,
  26: 6.0,
  27: 4.0,
  28: 8.0,
  29: 2.0,
  30: 4.0,
  31: 2.0,
  32: 2.0,
  33: 3.0,
  34: 2.0,
  35: 3.0,
  36: 2.0,
  37: 3.0,
  38: 3.0,
  39: 1.0,
  40: 4.0,
  41: 2.0,
  42: 1.0,
  43: 3.0,
  44: 3.0,
  45: 1.0,
  46: 1.0,
  47: 1.0,
  48: 5.0,
  49: 2.0,
  50: 2.0,
  51: 4.0,
  52: 4.0,
  53: 2.0,
  54: 3.0,
  55: 4.0,
  56: 2.0,
  57: 2.0,
  58: 1.0,
  59: 4.0,
  60: 3.0,
  61: 3.0,
  62: 3.0,
  63: 1.0,
  64: 3.0,
  65: 2.0,
  66: 2.0,
  67: 4.0,
  68: 2.0,
  69: 2.0,
  70: 1.0,
  71: 0.0,
  72: 5.0,
  73: 0.0,
  74: 3.0,
  75: 3.0,
  76: 2.0,
  77: 2.0,
  78: 2.0,
  79: 4.0,
  80: 1.0,
  81: 2.0,
  82: 0.0},
 'j': {0: 2.0,
  1: 1.0,
  2: 3.0,
  3: 3.0,
  4: 3.0,
  5: 2.0,
  6: 1.0,
  7: 9.0,
  8: 7.0,
  9: 4.0,
  10: 0.0,
  11: 3.0,
  12: 6.0,
  13: 2.0,
  14: 5.0,
  15: 4.0,
  16: 1.0,
  17: 2.0,
  18: 2.0,
  19: 3.0,
  20: 6.0,
  21: 6.0,
  22: 3.0,
  23: 4.0,
  24: 5.0,
  25: 3.0,
  26: 2.0,
  27: 1.0,
  28: 4.0,
  29: 0.0,
  30: 1.0,
  31: 0.0,
  32: 0.0,
  33: 2.0,
  34: 2.0,
  35: 1.0,
  36: 0.0,
  37: 4.0,
  38: 2.0,
  39: 0.0,
  40: 0.0,
  41: 2.0,
  42: 2.0,
  43: 1.0,
  44: 2.0,
  45: 1.0,
  46: 1.0,
  47: 2.0,
  48: 0.0,
  49: 1.0,
  50: 1.0,
  51: 2.0,
  52: 0.0,
  53: 0.0,
  54: 0.0,
  55: 1.0,
  56: 2.0,
  57: 1.0,
  58: 0.0,
  59: 1.0,
  60: 0.0,
  61: 1.0,
  62: 1.0,
  63: 1.0,
  64: 2.0,
  65: 0.0,
  66: 2.0,
  67: 2.0,
  68: 5.0,
  69: 1.0,
  70: 2.0,
  71: 2.0,
  72: 3.0,
  73: 0.0,
  74: 3.0,
  75: 0.0,
  76: 1.0,
  77: 2.0,
  78: 5.0,
  79: 3.0,
  80: 1.0,
  81: 4.0,
  82: 2.0},
 'n': {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 0.0,
  9: 0.0,
  10: 0.0,
  11: 0.0,
  12: 0.0,
  13: 0.0,
  14: 0.0,
  15: 0.0,
  16: 0.0,
  17: 0.0,
  18: 0.0,
  19: 0.0,
  20: 0.0,
  21: 0.0,
  22: 0.0,
  23: 0.0,
  24: 0.0,
  25: 0.0,
  26: 0.0,
  27: 0.0,
  28: 0.0,
  29: 0.0,
  30: 0.0,
  31: 0.0,
  32: 0.0,
  33: 0.0,
  34: 0.0,
  35: 0.0,
  36: 0.0,
  37: 0.0,
  38: 0.0,
  39: 0.0,
  40: 0.0,
  41: 0.0,
  42: 0.0,
  43: 0.0,
  44: 0.0,
  45: 0.0,
  46: 0.0,
  47: 0.0,
  48: 0.0,
  49: 0.0,
  50: 0.0,
  51: 0.0,
  52: 0.0,
  53: 0.0,
  54: 0.0,
  55: 0.0,
  56: 0.0,
  57: 0.0,
  58: 0.0,
  59: 0.0,
  60: 0.0,
  61: 0.0,
  62: 0.0,
  63: 0.0,
  64: 0.0,
  65: 0.0,
  66: 0.0,
  67: 0.0,
  68: 0.0,
  69: 0.0,
  70: 0.0,
  71: 0.0,
  72: 0.0,
  73: 1.0,
  74: 3.0,
  75: 6.0,
  76: 8.0,
  77: 2.0,
  78: 3.0,
  79: 2.0,
  80: 2.0,
  81: 5.0,
  82: 2.0},
 't': {0: 0.0,
  1: 0.0,
  2: 0.0,
  3: 0.0,
  4: 0.0,
  5: 0.0,
  6: 0.0,
  7: 0.0,
  8: 6.0,
  9: 3.0,
  10: 4.0,
  11: 8.0,
  12: 2.0,
  13: 5.0,
  14: 5.0,
  15: 3.0,
  16: 7.0,
  17: 3.0,
  18: 4.0,
  19: 2.0,
  20: 5.0,
  21: 1.0,
  22: 2.0,
  23: 2.0,
  24: 2.0,
  25: 1.0,
  26: 1.0,
  27: 6.0,
  28: 4.0,
  29: 5.0,
  30: 2.0,
  31: 3.0,
  32: 6.0,
  33: 1.0,
  34: 2.0,
  35: 1.0,
  36: 2.0,
  37: 1.0,
  38: 2.0,
  39: 1.0,
  40: 0.0,
  41: 2.0,
  42: 2.0,
  43: 2.0,
  44: 2.0,
  45: 2.0,
  46: 3.0,
  47: 0.0,
  48: 2.0,
  49: 5.0,
  50: 3.0,
  51: 4.0,
  52: 0.0,
  53: 1.0,
  54: 1.0,
  55: 0.0,
  56: 3.0,
  57: 1.0,
  58: 1.0,
  59: 0.0,
  60: 1.0,
  61: 1.0,
  62: 1.0,
  63: 2.0,
  64: 0.0,
  65: 1.0,
  66: 1.0,
  67: 0.0,
  68: 0.0,
  69: 0.0,
  70: 0.0,
  71: 0.0,
  72: 0.0,
  73: 0.0,
  74: 0.0,
  75: 0.0,
  76: 0.0,
  77: 0.0,
  78: 0.0,
  79: 0.0,
  80: 0.0,
  81: 0.0,
  82: 0.0}}

import pandas as pd

# data_to_dict is the dictionary shown above
df = pd.DataFrame(data_to_dict)
