I am experimenting with Vector Error Correction Models using the statsmodels package in Python. I am not sure how to correctly create a VECM model with two constant terms, one inside the cointegration equation, and one outside it (unrestricted constant).
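To make the two constants concrete, here is the model written out as a sketch. The notation is my own assumption (textbook-style, not taken from the statsmodels documentation): $y_t$ holds the $K$ endogenous variables, $\alpha$ and $\beta$ are $K \times r$ loading and cointegration matrices, $\rho$ is the $r \times 1$ constant inside the CE, and $\nu$ is the $K \times 1$ constant outside it.

```latex
\begin{aligned}
\Delta y_t &= \alpha\left(\beta' y_{t-1} + \rho\right) + \nu
              + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i} + u_t \\
           &= \alpha \beta' y_{t-1} + \left(\alpha \rho + \nu\right)
              + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i} + u_t
\end{aligned}
```

Written this way, only the sum $\alpha\rho + \nu$ enters the equation. With $\nu$ left unrestricted, any change in $\rho$ can be offset by a change in $\nu$, so the two are not separately identified without a further restriction. That may be the reason the documentation advises against combining "co" and "ci", and it could be related to the singular matrix errors and the very large coefficient described below.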
The user guide explains two methods for specifying deterministic terms in the VECM model:
- Using the deterministic argument. Passing deterministic = "ci" adds a constant term only inside the CE (restricted constant). Passing "co" adds a constant term only outside the CE. It's possible to combine one constant term with one linear trend term, for example "cili" creates a constant & linear trend term in the CE only. It's not possible to use both "co" and "ci" according to the documentation for vecm.VECM. I have tried both "co" and "ci" and they indeed create only one constant term, outside or inside the CE respectively.
- Using the exog and exog_coint arguments to manually create & pass the deterministic terms outside & inside the CE respectively. For example, to create a constant term in the CE, the user guide suggests passing exog_coint = np.ones(len(data)). However, the documentation for the vecm.VECM function suggests the shape of exog and exog_coint must match the dimension of the endogenous variables passed to endog.
I have 3 endogenous variables. To try and include a constant term in both the CE and outside it, I have tried several approaches:
Approach 1
Pass deterministic = "co", and exog_coint as an ndarray of ones, for each endogenous variable, with the shape (len(train), 3):
import numpy as np
from statsmodels.tsa.vector_ar import vecm

# Create deterministic terms for each endogenous variable
exog_coint = np.array(
    [np.ones(len(train)), np.ones(len(train)), np.ones(len(train))]
)
exog_coint = exog_coint.transpose()  # Shape (len(train), 3)
# Specify VECM
model_vecm = vecm.VECM(
    endog = train, # Endogenous variables, dataframe with 3 columns
    k_ar_diff = selected_lags, # Order of lags
    coint_rank = 1, # N. of cointegrating relationships
    exog_coint = exog_coint, # Constant term inside the CE
    deterministic = "co" # Constant term outside the CE
)
# Fit VECM
res_vecm = model_vecm.fit()
# Print summary
res_vecm.summary()
This approach yields the LinAlgError: Singular matrix error when trying to fit the model. From this, I assume the shape of exog_coint has to be (len(train), coint_rank) instead: One set of constants for each cointegrating relationship (just one in my case).
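One thing that can be checked independently of statsmodels is that the three columns of ones are perfectly collinear. The sketch below uses only NumPy; n_obs is a hypothetical stand-in for len(train), and I am assuming (not asserting) that this collinearity is what triggers the error inside the fitting routine.

```python
import numpy as np

n_obs = 200  # hypothetical sample length, stands in for len(train)

# Same construction as in approach 1: three identical columns of ones
exog_coint = np.array(
    [np.ones(n_obs), np.ones(n_obs), np.ones(n_obs)]
).transpose()

rank = np.linalg.matrix_rank(exog_coint)
print(exog_coint.shape)  # (200, 3)
print(rank)              # 1: the three columns carry the same information

# The cross-product matrix of these columns has every entry equal to n_obs,
# so it cannot be inverted.
gram = exog_coint.T @ exog_coint
try:
    np.linalg.inv(gram)
except np.linalg.LinAlgError as err:
    print("LinAlgError:", err)
```

Any estimation step that needs to invert a moment matrix containing these columns would hit the same problem, which is consistent with needing a single column of constants rather than one per endogenous variable.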
Approach 2
Pass deterministic = "co" and exog_coint = np.ones(len(train)), passing only one 1D array of constants. The code for this approach is the same as above, except for creating exog_coint. This time, the model is fitted with no error, but trying to print the model summary yields the following error:
RuntimeWarning: invalid value encountered in sqrt
return np.sqrt(np.diag(self.cov_params_default))
IndexError: tuple index out of range
res_vecm.const_coint returns a zero value, but res_vecm.det_coef_coint does yield a non-zero constant term. Problem is, this value is astronomically large: 5.2849e+19 compared to my endogenous variables mostly in the ten thousands. I do not know if this is correct. The fitted values and predictions seem fine, but the vecm.select_order function also does not work:
# Select lag orders for VECM
res_lags = vecm.select_order(
    data = train,
    maxlags = 15,
    exog_coint = exog_coint, # Created as np.ones(len(train))
    deterministic = "co"
)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has
2 dimension(s) and the array at index 1 has 1 dimension(s)
This error suggests some sort of mismatch between the endogenous variables argument (data) and the exog_coint argument, but the same arguments work fine in fitting the VECM.
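The wording of this ValueError is the one NumPy produces when a 2D array is concatenated with a 1D array. The sketch below reproduces it with NumPy alone; endog_like and n_obs are hypothetical stand-ins for my data, and I do not know which internal call in vecm.select_order actually raises it, so whether a column-shaped array fixes select_order is an untested assumption.

```python
import numpy as np

n_obs = 200                           # hypothetical, stands in for len(train)
endog_like = np.zeros((n_obs, 3))     # 2D stand-in for the endogenous data
ones_1d = np.ones(n_obs)              # shape (n_obs,), as passed in approach 2
ones_2d = ones_1d.reshape(-1, 1)      # shape (n_obs, 1), an explicit column

message = ""
try:
    # Mixing a 2D and a 1D array fails before any values are looked at
    np.concatenate([endog_like, ones_1d], axis=1)
except ValueError as err:
    message = str(err)
    print(message)

# With an explicit column the same operation succeeds
stacked = np.concatenate([endog_like, ones_2d], axis=1)
print(stacked.shape)  # (200, 4)
```

So one thing worth trying is to pass the constant as a single column, np.ones((len(train), 1)), instead of a flat 1D array.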
Approach 3
Do not use the argument deterministic. Instead, create & pass constant terms both outside and inside the CE, with arguments exog and exog_coint respectively.
# Create deterministic terms outside the CE for each endogenous variable
exog = np.array(
    [np.ones(len(train)), np.ones(len(train)), np.ones(len(train))]
)
exog = exog.transpose()  # Shape (len(train), 3)
# Create one deterministic term inside the CE
exog_coint = np.ones(len(train))
# Specify VECM
model_vecm = vecm.VECM(
    endog = train, # Endogenous variables, dataframe with 3 columns
    k_ar_diff = selected_lags, # Order of lags
    coint_rank = 1, # N. of cointegrating relationships
    exog = exog, # Constant term outside the CE
    exog_coint = exog_coint # Constant term inside the CE
)
# Fit VECM
res_vecm = model_vecm.fit()
# Print summary
res_vecm.summary()
Like approach 1, this approach yields a LinAlgError: Singular matrix error when trying to fit the model. vecm.select_order also yields the same error as approach 2:
# Select lag orders for VECM
res_lags = vecm.select_order(
    data = train,
    maxlags = 15,
    exog = exog,
    exog_coint = exog_coint
)
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has
1 dimension(s) and the array at index 1 has 2 dimension(s)
Approach 4
Use exog and exog_coint, but pass a 1D array of np.ones(len(train)) to both arguments. This approach fits the model without error, and yields the same fitted values, predictions & parameters as approach 2. It also yields the same error when trying to print the model summary. In addition, vecm.select_order yields the following error:
# Select lag orders for VECM
res_lags = vecm.select_order(
    data = train,
    maxlags = 15,
    exog = exog_coint, # Created as np.ones(len(train))
    exog_coint = exog_coint
)
ValueError: endog and exog matrices are different sizes
This is confusing, as the model seems to be fitted correctly as in approach 2, but the error message for vecm.select_order clearly states exog must have the same shape as endog.
So to sum up, here are my questions:
- How to properly include a constant term both outside and inside the cointegration equation in a vecm.VECM model? Is this possible at all? It seems the VECM function from the R library tsDyn takes two string arguments, one for deterministic terms inside the CE, another for outside.
- How to properly use the exog and exog_coint arguments? What should be the correct shape for each?