PyStan (imported as stan) raises a "mismatch in number dimensions" error for y
import numpy as np
import stan
# Load data (replace X and y with your actual data)
X = np.linspace(1,100,100).reshape([100,1])
y = np.random.randint(0,2,100).reshape([100,1])
N = y.shape[0]
K = X.shape[1]
data = {'N': N, 'K': K, 'X': X, 'y': y}
# Compile Stan model
posterior = stan.build(lr_model, data=data, random_seed=1)
# Fit the model (PyStan 3 takes num_chains/num_samples; data was already passed to stan.build)
samples = posterior.sample(num_chains=4, num_samples=1000)
Error:
Exception: mismatch in number dimensions declared and found in context; processing stage=data initialization; variable name=y; dims declared=(100); dims found=(100,1) (in '/tmp/httpstan_ii3yhtja/model_776ng6bg.stan', line 7, column 2 to column 35)
I don't understand why Stan expects dimensions (100) rather than (100,1), or how to fix it. I tried passing lists instead of arrays, but the same error was returned.
Here's the Stan model (lr_model) used above:
lr_model = """
data {
  int<lower=0> N; // number of observations
  int<lower=0> K; // number of predictors
  matrix[N, K] X; // predictor matrix
  array[N] int<lower=0, upper=1> y; // binary response
}
parameters {
  vector[K] beta; // regression coefficients
}
model {
  beta ~ normal(0, 1);
  y ~ bernoulli_logit(X * beta);
}
"""