What are the main differences between Python and R splines?


I am trying to develop a model using natural cubic splines in Python. I have some background using splines in R, but I need to reproduce the model in Python.

In R, this is how I am doing the model:

library(splines)
formula <- as.formula('y ~ x1 + x2 + ns(x3,df=3)')
model <- glm(formula, data = results, family = poisson())

R GLM Results

In Python, I am using the patsy library with the following code:

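import numpy as np
import pandas as pd
import statsmodels.api as sm
from patsy import dmatrix

# Spline basis for the x3 predictor (no intercept column)
x3_ns = dmatrix("cr(x3, df=3) - 1", {"x3": x3}, return_type='dataframe')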

# Build a DataFrame so the spline columns can be concatenated to it
array_x = np.concatenate([y, x_pred], axis=1)
columns = ['y', 'x1', 'x2']
df = pd.DataFrame(data=array_x, columns=columns)

# Concatenate the variables with the spline columns
df_final = pd.concat([df, x3_ns], axis=1)
df_final.rename(columns={'cr(x3, df=3)[0]': 'x3_sp_1',
                         'cr(x3, df=3)[1]': 'x3_sp_2',
                         'cr(x3, df=3)[2]': 'x3_sp_3'}, inplace=True)

# Formula creation 
formula = 'y ~ x1 + x2 + x3_sp_1 + x3_sp_2 + x3_sp_3'

# GLM 
poisson_model = sm.GLM.from_formula(formula, data=df_final, family=sm.families.Poisson(), missing='drop')
glm = poisson_model.fit()

Python GLM Results

The model outputs are not exactly the same. Does anyone know the reason for these differences? I observed that the differences in the x1 and x2 coefficients were small, but the intercept and the spline coefficients were completely different. If anyone could help, I would appreciate it. Thanks!
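
A sketch of one likely contributor, not a confirmed diagnosis: patsy's cr(x3, df=3) and R's ns(x3, df=3) do not build the same basis. The unconstrained cr columns sum to one in every row, so they are collinear with the model intercept, which leaves the intercept and spline coefficients not uniquely determined. With df=3, cr also uses one interior knot, while ns uses two. The snippet below checks the collinearity on synthetic data (the x3 values are made up for illustration) and tries constraints='center', the patsy option that removes the constant from the basis.

```python
import numpy as np
from patsy import dmatrix

# Synthetic stand-in for x3 (illustrative values, not the data from the question)
rng = np.random.default_rng(0)
x3 = rng.uniform(0.0, 10.0, size=200)
ones = np.ones((x3.size, 1))

# Unconstrained basis, as in the question
basis = np.asarray(dmatrix("cr(x3, df=3) - 1", {"x3": x3}))
row_sums = basis.sum(axis=1)

# Rank of [intercept | spline columns]; a rank below 4 signals collinearity
rank_with_intercept = np.linalg.matrix_rank(np.hstack([ones, basis]))

# Centered basis: the constant is removed, so an intercept can sit alongside it
centered = np.asarray(
    dmatrix("cr(x3, df=3, constraints='center') - 1", {"x3": x3})
)
rank_centered = np.linalg.matrix_rank(np.hstack([ones, centered]))

print(np.allclose(row_sums, 1.0), rank_with_intercept, rank_centered)
```

If the centered basis is used in the GLM together with the intercept, the fitted values and the x1 and x2 coefficients would be expected to move closer to R's. The individual spline coefficients can still differ, because the two libraries parametrize the basis differently and knot placement may not match exactly when x3 has repeated values.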
