I am trying to model a censored regression based on this example in PyMC3.
The difference from the example is that my data is only right censored, but apart from that not much difference. I have tried it on some mock data and everything works fine, but when I try to use my real data things starts to go wrong and I get a RuntimeError: Chain 1 failed. with a RuntimeWarning: divide by zero encountered in true_divide error and ValueError: Mass matrix contains zeros on the diagonal. The derivative of RV 'intercept'.ravel()[0] is zero. when launching the pm.sample() command.
Here is my code:
def bayesian_censored_regression(x: pd.Series, y: pd.Series, censoring_flag: pd.Series):
"""
Bayesian censored regression using PyMC3
"""
x_rc = x[censoring_flag]
y_rc = y[censoring_flag]
uncensored = np.logical_not(censoring_flag)
x = x[uncensored]
y = y[uncensored]
sigma_prior = np.std(y)
with pm.Model() as model:
slope = pm.Normal("slope", mu=0, sigma=1)
intercept = pm.Normal("intercept", mu=0, sigma=1)
σ = pm.HalfNormal("σ", sd=sigma_prior)
y_likelihood = pm.Normal("y_likelihood", mu=slope * x + intercept, sigma=σ, observed=y)
right_censored = pm.Potential(
"right_censored", normal_lccdf(slope * x_rc + intercept, σ, y_rc)
)
return model
x = df.var1.values.reshape(-1, 1)
y = df.var2.values.reshape(-1, 1)
censoring_flag = df.censored.values.reshape(-1, 1)
censored_model = bayesian_censored_regression(x, y, censoring_flag)
with censored_model:
censored_fit = pm.sample(return_inferencedata=True, draws=10000)
As extra info np.mean(x), np.std(x) is (2613.754382911023, 2010.0931627621942)
and np.mean(y), np.std(y) is (71.57414301625279, 69.44685037573566).
How can I debug this error? how can i fix it? The data can be downloaded from this link.