I am converting a SAS ARIMA process to Python statsmodels. The model is an ARIMAX built with PROC ARIMA in SAS. My question is: how can I implement "Conditional Least Squares Estimation" in statsmodels SARIMAX? SAS uses conditional least squares (CLS) as the default estimation method in PROC ARIMA, but statsmodels does not seem to offer it. I would like to get the same coefficients in statsmodels as in SAS.
SAS PROC ARIMA gives a constant coefficient of 119.16860 and an AR lag-1 coefficient of 0.93610 for the AR(1) process with a constant regressor of ones.
However, statsmodels gives totally different coefficient values. I know what SARIMAX is and I am well-versed in time-series forecasting; the following code is only meant to illustrate the problem. The difference in coefficient values seems to occur because statsmodels does not have "conditional least squares estimation" as an estimation method.
The statsmodels code is as follows.
import pandas as pd
import numpy as np
from datetime import date
import statsmodels.api as sm
AIR = pd.DataFrame({'air': [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146,166]})
AIR.index = pd.date_range(start = '1949/1/1', periods = AIR.shape[0], freq = 'MS')  # 'MS' = month start; the 'M' alias is deprecated in recent pandas
AIR.index = [date(x.year, x.month, 1) for x in AIR.index]
AIR.loc[:,'const'] = list(np.ones(AIR.shape[0]))
mod = sm.tsa.statespace.SARIMAX(AIR['air'], AIR['const'], order=(1,0,0), trend = 'n')
results = mod.fit(maxiter=50)
print(results.summary())
I would like to get the same coefficient estimates in statsmodels as in the SAS ARIMA process, but it seems that statsmodels does not have a "conditional least squares" option for SARIMAX.
SARIMAX in statsmodels uses maximum likelihood (ML) estimation. If you set PROC ARIMA to use ML estimation as well (METHOD=ML on the ESTIMATE statement), you'll get nearly identical estimates. Other than that, you cannot make them the same unless SARIMAX supports CLS estimation in the future.
[Screenshot: SAS output]
[Screenshot: statsmodels output]
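For the plain AR(1)-with-mean case in the question, one way to see what a conditional least squares fit looks like is to compute it by hand. The sketch below is not part of SARIMAX and is only an illustration: it uses the textbook definition of CLS for AR(1), which conditions on the first observation and reduces to ordinary least squares of y_t on y_{t-1}. The helper name `ar1_cls` is made up for this example. SAS may treat the initial observation differently in its CLS routine, so the numbers are not guaranteed to reproduce the PROC ARIMA output exactly.

```python
import numpy as np

# Same series as in the question.
air = np.array([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
                115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,
                145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146, 166],
               dtype=float)


def ar1_cls(y):
    """Hypothetical helper: CLS for y_t = c + phi * y_{t-1} + e_t.

    Conditions on the first observation, so the fit is plain OLS of
    y[1:] on a constant and y[:-1]. Returns (mu, phi), where
    mu = c / (1 - phi) is the implied process mean (the parameterization
    SAS reports as MU). Assumes phi != 1.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    mu = c / (1.0 - phi)
    return mu, phi


mu, phi = ar1_cls(air)
print(f"mean: {mu:.5f}, AR(1): {phi:.5f}")
```

If the result is close to the SAS values, the remaining gap most likely comes from how each tool handles the start of the sample. For models with MA terms or exogenous regressors, the same idea requires a numerical minimizer over the conditional sum of squares rather than a single OLS call.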