I am trying to fit FTIR spectra with other reference spectra. I have FTIR spectra of tissue and reference spectra of various components such as DNA, RNA and others. I want to fit the reference spectra into my input spectra. The size and shape of the input files are given as follows. I have performed CLS regression fitting and the fitting plot shows a good fit. However, I am doubtful about the log-liklihood and the BIC values. If any suggestions to improve the code?
import numpy as np
import pandas as pd
from scipy import io
from statistics import variance
import math
input_mat = io.loadmat('[path to input mat file]')
input_spectra = input_mat['ab'].T
input_wavenumbers = np.linspace(1800, 900, input_spectra.shape[1])
ref_spectra_mat = io.loadmat('path to reference spectra mat file')
reference_spectra = ref_spectra_mat['AB'].T
wavenumber_mat = io.loadmat('path to wavenummber mat file')
reference_wavenumbers = wavenumber_mat['wavenumber'][0]
reference_df = pd.DataFrame(reference_spectra, columns=reference_wavenumbers)
input_df = pd.DataFrame(input_spectra, columns=input_wavenumbers)
matched_columns = {iw: reference_wavenumbers[np.argmin(np.abs(reference_wavenumbers - iw))] for iw in input_wavenumbers}
matched_reference_df = reference_df[list(matched_columns.values())]
matched_input_df = input_df.rename(columns=matched_columns)
print(f"Matched wavenumbers (input to reference): {matched_columns}")
matched_reference_df = matched_reference_df[matched_input_df.columns]
print(matched_reference_df)
X = matched_reference_df.to_numpy()
Y = matched_input_df.to_numpy()
coefficients = np.zeros((Y.shape[0], X.shape[0]))
for i in range(Y.shape[0]):
coefs, _, _, _ = np.linalg.lstsq(X.T, Y[i, :], rcond=None)
coefficients[i, :] = coefs
print("Coefficients shape:", coefficients.shape)
residuals = Y - reconstructed_spectra
se = residuals**2
n = Y.shape[1]
log_likelihood = []
for i in range(0,se.shape[0]):
var = variance(residuals[i,:])
ll = log(np.cumprod((np.sqrt(2*np.pi*var[i,:]))^(-1)*np.exp(-se[i,:]/(2*var[i,:]))))
log_likelihood.append(ll)
print("Log Likelihood:", log_likelihood)
I need suggestions on logliklihood calculations. I get either very high value of 7 or 8 digits or error due to underflow
ANy quick help is highly appreciated.