How to overcome infinite VIF values when testing environmental variables for multicollinearity?


I have a dataset X with 45 columns, collected from a bioclimate database. This is the code I use for the multicollinearity test:

import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# X is the DataFrame holding the 45 bioclimate variables
vif_data = pd.DataFrame()
vif_data["feature"] = X.columns

[Image: X columns]

# Calculate the VIF for each feature
vif_data["VIF"] = [variance_inflation_factor(X.values, i)
                   for i in range(len(X.columns))]
-------------------------------------------------------------
/usr/local/lib/python3.7/dist-packages/statsmodels/stats/outliers_influence.py:193: 
RuntimeWarning: divide by zero encountered in double_scalars
vif = 1. / (1. - r_squared_i)
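
The line in the warning shows where the infinity comes from: VIF is 1 / (1 - R²), so it divides by zero whenever a column is predicted perfectly by the others (R² = 1). Below is a minimal sketch of that effect using NumPy only and made-up data. The helper `vif` and the arrays `a`, `b` are hypothetical, and the uncentered R² without an added intercept is an assumption about how the library treats the matrix it is given, not a copy of its code.

```python
import numpy as np

def vif(X, i):
    """VIF of column i: regress it on the other columns, then 1 / (1 - R^2)."""
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    coef, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ coef
    # Uncentered R^2 (no intercept is added here; this is an assumption)
    r_squared = 1.0 - (resid @ resid) / (y @ y)
    with np.errstate(divide="ignore"):
        # NumPy floats return inf on division by zero instead of raising
        return np.float64(1.0) / (np.float64(1.0) - r_squared)

rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)

independent = np.column_stack([a, b])        # unrelated columns
collinear = np.column_stack([a, b, a + b])   # third column is an exact sum

vif_independent = [vif(independent, i) for i in range(independent.shape[1])]
vif_collinear = [vif(collinear, i) for i in range(collinear.shape[1])]

print("independent:", vif_independent)   # values close to 1
print("collinear:  ", vif_collinear)     # inf or astronomically large
```

Note that one exact linear relationship makes the VIF blow up for every column that takes part in it, not only for the "extra" one.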

vif_data

What I have tried:

  • I converted all variables to float, and to int and back again, but I still get infinite VIF values for every variable.

  • I could not find any reference material on how to tackle this problem, especially in Python. Please help me out; I am using this for species distribution modelling.
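
Changing the dtype cannot help here, because the infinity comes from exact linear dependence between columns and not from how the numbers are stored. One way to narrow it down is to compare the rank of the data matrix with the number of columns and list the columns that add no new information. The following is only a sketch under assumptions: `diagnose_collinearity` is a hypothetical helper (not part of statsmodels), the demo data are invented, and on the real data it would be called as `diagnose_collinearity(X.values, X.columns)`.

```python
import numpy as np

def diagnose_collinearity(values, names):
    """Report rank, constant columns, and columns that are exact linear
    combinations of the columns that come before them."""
    values = np.asarray(values, dtype=float)
    names = list(names)

    # Columns that never change carry no information of their own
    constant = [n for n, col in zip(names, values.T) if col.max() == col.min()]

    # Greedy pass: keep a column only if it raises the rank of the kept set
    kept_idx = []
    for j in range(values.shape[1]):
        trial = values[:, kept_idx + [j]]
        if np.linalg.matrix_rank(trial) == len(kept_idx) + 1:
            kept_idx.append(j)
    redundant = [names[j] for j in range(len(names)) if j not in kept_idx]

    return {
        "n_rows": values.shape[0],
        "n_columns": values.shape[1],
        "rank": int(np.linalg.matrix_rank(values)),
        "constant": constant,
        "redundant": redundant,
    }

# Invented demo data: "c" is exactly a + b, "d" is independent
rng = np.random.default_rng(1)
a = rng.normal(size=50)
b = rng.normal(size=50)
d = rng.normal(size=50)
demo = np.column_stack([a, b, a + b, d])

report = diagnose_collinearity(demo, ["a", "b", "c", "d"])
print(report)
```

If `rank` is smaller than `n_columns`, at least one exact dependency exists, and dropping the columns listed under `redundant` before recomputing the VIFs is one option. If `n_rows` is smaller than `n_columns`, every VIF will be infinite regardless, since each column can then be fitted perfectly by the others. Which column of a dependent group gets flagged depends on the column order, so the choice of what to drop should still be guided by ecological relevance.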

