Statsmodels SARIMAX vs Proc ARIMA SAS


I am converting a SAS PROC ARIMA workflow to Python statsmodels. The model is an ARIMAX built with PROC ARIMA in SAS. My question is: how can I implement conditional least squares (CLS) estimation in statsmodels SARIMAX? CLS is the default estimation method in PROC ARIMA, but statsmodels does not appear to offer it. I would like to get the same coefficients in statsmodels as in SAS.

For an AR(1) process with a constant regressor of ones, PROC ARIMA gives a constant coefficient of 119.16860 and an AR lag-1 coefficient of 0.93610.

However, statsmodels gives very different coefficient values. I know what SARIMAX is and am well versed in time-series forecasting; the following code is only meant to illustrate the problem. The difference seems to arise because statsmodels does not offer conditional least squares estimation.

Statsmodels code is as follows.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Monthly 'air' series: 36 observations starting January 1949
AIR = pd.DataFrame({'air': [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
                            115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,
                            145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146, 166]})

# Month-start dates; 'MS' yields the first day of each month directly
AIR.index = pd.date_range(start='1949-01-01', periods=len(AIR), freq='MS')
# Column of ones, passed as exog so the constant is a regression coefficient
AIR['const'] = np.ones(len(AIR))

# AR(1) with the constant supplied through exog, so the built-in trend is off
mod = sm.tsa.statespace.SARIMAX(AIR['air'], exog=AIR['const'], order=(1, 0, 0), trend='n')
results = mod.fit(maxiter=50)
print(results.summary())

I would like statsmodels to produce the same coefficient estimates as SAS PROC ARIMA, but SARIMAX does not seem to offer conditional least squares estimation.

Answered by Stu Sztukowski

SARIMAX in statsmodels uses Maximum Likelihood estimation. If you set PROC ARIMA to use ML estimation, you'll get nearly identical estimates. Other than that, you cannot make them the same unless SARIMAX supports CLS estimation in the future.

proc arima data=have;
    identify var=air;
    estimate p=1 method=ml maxiter=50;
run;

SAS:

Parameter    Estimate      Standard Error    t Value   Approx Pr > |t|    Lag
MU           144.19262     12.02918          11.99     <.0001             0
AR1,1        0.81980       0.09513            8.62     <.0001             1

statsmodels:

                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const        144.1884     11.317     12.740      0.000     122.007     166.370
ar.L1          0.8197      0.116      7.091      0.000       0.593       1.046
sigma2       203.9844     70.067      2.911      0.004      66.656     341.313
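
To make the difference between the two estimators concrete, here is a minimal sketch of a least squares fit for an AR(1) with a mean, conditioning on the first observation. It is an illustration only: `ar1_conditional_least_squares` is a hypothetical helper, not a statsmodels or SAS function, and it is an assumption that this simple form matches how PROC ARIMA treats the start of the series, so the numbers it prints may not reproduce the SAS CLS output exactly.

```python
# Hypothetical helper (not part of statsmodels or SAS): fit
#   y_t = c + phi * y_{t-1} + e_t
# by ordinary least squares on t = 2..n, i.e. conditioning on y_1.
def ar1_conditional_least_squares(series):
    y = [float(v) for v in series]
    lagged = y[:-1]    # y_{t-1}
    current = y[1:]    # y_t
    n = len(current)

    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n

    # Closed-form simple linear regression of y_t on y_{t-1}
    sxy = sum((a - mean_lag) * (b - mean_cur) for a, b in zip(lagged, current))
    sxx = sum((a - mean_lag) ** 2 for a in lagged)
    phi = sxy / sxx
    intercept = mean_cur - phi * mean_lag

    # Process mean implied by the intercept (undefined when phi == 1)
    mu = intercept / (1.0 - phi)
    return mu, phi


# Same series as in the question
air = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
       115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,
       145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146, 166]

mu, phi = ar1_conditional_least_squares(air)
print(f"mu = {mu:.5f}, phi = {phi:.5f}")
```

The sketch minimizes the sum of squared one-step errors and ignores the likelihood contribution of the first observation, which is why it can differ noticeably from the ML estimates above on a short series. Within statsmodels, `statsmodels.tsa.ar_model.AutoReg` with `lags=1` and `trend='c'` fits the same kind of regression by OLS, reporting the intercept rather than the mean.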