Mahalanobis distance not equal to Euclidean distance after PCA


I am trying to compute the Mahalanobis distance as the Euclidean distance after a PCA transformation; however, I do not get the same results. The following code:

import numpy as np
from scipy.spatial.distance import mahalanobis
from sklearn.decomposition import PCA

X = [[1,2], [2,2], [3,3]]

mean = np.mean(X, axis=0)
cov = np.cov(X, rowvar=False)
covI = np.linalg.inv(cov)

maha = mahalanobis(X[0], mean, covI)
print(maha)

pca = PCA()

X_transformed = pca.fit_transform(X)

stdev = np.std(X_transformed, axis=0)
X_transformed /= stdev

print(np.linalg.norm(X_transformed[0]))

prints

1.1547005383792515
1.4142135623730945

To my understanding, the PCA uncorrelates the dimensions, and the division by the standard deviation weights every dimension equally, so the Euclidean distance should equal the Mahalanobis distance. Where am I going wrong?

1 Answer

Accepted answer (jylls):

According to this discussion, the relationship between PCA and the Mahalanobis distance only holds when the PCA components have unit variance. You can obtain this by whitening the PCA output (more information here).
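To sanity-check the unit-variance claim, you can inspect the whitened scores directly. A minimal sketch (assuming scikit-learn's convention, where the per-component variance is the sample variance, i.e. normalized by n − 1 like `np.cov`):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[1, 2], [2, 2], [3, 3]])

# With whiten=True, each principal component is rescaled so its
# sample variance (ddof=1, the same convention np.cov uses) is 1.
X_w = PCA(whiten=True).fit_transform(X)
print(np.var(X_w, axis=0, ddof=1))  # -> approximately [1. 1.]
```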

Once you do that, the Mahalanobis distance in the original space equals the Euclidean distance in the PCA space. You can see a demonstration of that in the code below:

import numpy as np
from scipy.spatial.distance import mahalanobis, euclidean
from sklearn.decomposition import PCA

X = np.array([[1, 2], [2, 2], [3, 3]])

# Mahalanobis distance between the first two points in the original space
cov = np.cov(X, rowvar=False)
covI = np.linalg.inv(cov)
maha = mahalanobis(X[0], X[1], covI)

# whiten=True rescales each principal component to unit variance
pca = PCA(whiten=True)
X_transformed = pca.fit_transform(X)

print('Mahalanobis distance: ' + str(maha))
print('Euclidean distance: ' + str(euclidean(X_transformed[0], X_transformed[1])))

The output gives:

Mahalanobis distance: 2.0
Euclidean distance: 2.0000000000000004
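For completeness, here is a likely explanation of the original mismatch (my reading, not stated in the accepted answer): `np.cov` normalizes by n − 1 (ddof=1) by default, while `np.std` normalizes by n (ddof=0), so the question's division rescaled the components by the wrong factor. Dividing the unwhitened PCA scores by the ddof=1 standard deviation instead reproduces the Mahalanobis distance from the question:

```python
import numpy as np
from scipy.spatial.distance import mahalanobis
from sklearn.decomposition import PCA

X = np.array([[1, 2], [2, 2], [3, 3]])

mean = np.mean(X, axis=0)
covI = np.linalg.inv(np.cov(X, rowvar=False))  # np.cov divides by n - 1
maha = mahalanobis(X[0], mean, covI)

X_t = PCA().fit_transform(X)
X_t /= np.std(X_t, axis=0, ddof=1)  # ddof=1 to match np.cov's normalization

print(maha, np.linalg.norm(X_t[0]))  # both ~1.1547
```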