I am trying to implement in Python the dimensionality reduction method Sparsity Preserving Projections (SPP), following the paper "Sparsity preserving projections with applications to face recognition" by Lishan Qiao, Songcan Chen and Xiaoyang Tan.
The steps are (a rough sketch of steps 1-3 follows this list):
1. Load the dataset (Yale B).
2. Split the dataset into training and testing sets and normalize.
3. Apply PCA, preserving 98% of the total variance.
4. Apply SPP (the snpe function below) on the PCA-transformed data.
5. Evaluate with SVM and kNN.
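For context, my preprocessing looks roughly like this. load_yale_b is a placeholder for my own loading code, and the split size, stratification and random seed are just what I happen to use, not anything prescribed by the paper:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import normalize
from sklearn.decomposition import PCA

# Placeholder loader: images flattened to row vectors, integer class labels
X, y = load_yale_b()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
# Normalize each sample to unit L2 norm
X_train = normalize(X_train)
X_test = normalize(X_test)
# Keep enough principal components to explain 98% of the total variance
pca = PCA(n_components=0.98)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)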
My problem is that whatever I do, I get an accuracy score of ~3%.
import numpy as np
import scipy.linalg
import spams
from sklearn.preprocessing import normalize
from tqdm import tqdm

def sparse_representation(X):
    # X has shape (m features, n samples); build the n x n coefficient matrix S
    m, n = X.shape
    S = np.zeros((n, n))
    # SPAMS requires Fortran-ordered float64 arrays
    X = np.asfortranarray(X, dtype=np.float64)
    # mode=2 solves min_s 0.5*||x - D s||_2^2 + lambda1*||s||_1
    param = {'lambda1': 1e-6, 'numThreads': -1, 'mode': 2}
    # Solve one lasso problem per sample
    for i in tqdm(range(n)):
        # Dictionary: all samples except x_i (np.delete returns C order, so re-convert)
        X_temp = np.asfortranarray(np.delete(X, i, axis=1))
        x = np.asfortranarray(X[:, i].reshape(-1, 1))
        # Sparse coefficients of x_i over the remaining samples
        s_i = spams.lasso(x, D=X_temp, return_reg_path=False, **param).toarray().ravel()
        # Take absolute values (note: the paper's formulation keeps signed coefficients)
        s_i = np.abs(s_i)
        # Zero out near-zero coefficients
        s_i[s_i < 1e-7] = 0
        # Normalize s_i to unit L1 norm
        s_i_norm = np.linalg.norm(s_i, ord=1)
        if s_i_norm != 0:
            s_i = s_i / s_i_norm
        # Re-insert a zero at position i (a sample never reconstructs itself)
        s_i = np.insert(s_i, i, 0)
        # Store s_i as the i-th column of S
        S[:, i] = s_i
    return S
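As a sanity check (random data standing in for the real features, purely illustrative), the output should have the following structure:

X_demo = np.random.randn(50, 120)   # 50 features, 120 samples
S = sparse_representation(X_demo)
assert S.shape == (120, 120)        # one coefficient vector per sample
assert np.allclose(np.diag(S), 0)   # zero diagonal: no self-reconstruction
col_norms = np.abs(S).sum(axis=0)   # non-empty columns have unit L1 norm
assert np.all(np.isclose(col_norms, 1) | (col_norms == 0))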
def snpe(X, n_components):
    # X has shape (m features, n samples)
    S = sparse_representation(X)
    # S_beta = S + S^T - S^T S, as defined in the paper
    S_beta = S + S.T - S.T @ S
    # Generalized eigenvalue problem: X S_beta X^T w = lambda X X^T w
    A = X @ S_beta @ X.T
    B = X @ X.T
    # Add a small positive value to the diagonal of B to keep it positive definite
    epsilon = 1e-6
    np.fill_diagonal(B, np.diag(B) + epsilon)
    # eigh returns eigenvalues and eigenvectors (as columns), in ascending order
    eigvals, eigvecs = scipy.linalg.eigh(A, B)
    # Sort the eigenvalues and eigenvectors in descending order
    idx = np.argsort(eigvals)[::-1]
    eigvecs = eigvecs[:, idx]
    # Normalize the eigenvectors: they are columns, so axis=0 (axis=1 would
    # normalize across rows and distort the projection directions)
    eigvecs = normalize(eigvecs, axis=0)
    return eigvecs[:, :n_components]
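And this is roughly how I use snpe afterwards, continuing from the preprocessing sketch above. The number of components is an arbitrary choice on my part (it must not exceed the PCA output dimension), and 1-NN is just one of the classifiers I evaluate:

# snpe expects samples as columns, so the PCA output is transposed
W = snpe(X_train_pca.T, n_components=100)
X_train_spp = X_train_pca @ W
X_test_spp = X_test_pca @ W

from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train_spp, y_train)
print('1-NN accuracy:', knn.score(X_test_spp, y_test))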
I have tried several solvers for the sparse representation step, such as plain linear regression, Lasso, and ElasticNet, and the result does not change.
The authors implemented this method in Matlab, and I am trying to reconstruct it in Python. According to the paper, I should be getting roughly 94% accuracy on the Yale B dataset.