Scipy-correlate: How to change datapoint lags into time lags?


I have a problem with correlating two light curves for my bachelor thesis. I use scipy.signal.correlate to calculate the correlation. The two light curves have different numbers of data points and cover different times. I think the first one has data between 2020 and 2020 and the second one before 2020. So I created data frames with pandas, merged both curves into one frame, and filled every "NaN" with zero. I correlated and normalized the curves and plotted the correlation against the lags; the plot looks like it could be correct.

My problem is that the lags seem to be data-point lags, not time lags. They tell me by how many data points I have to shift one curve to best match the other, not by how many days. I found a conversion, but it needs the "sample rate" (?), which I can't provide because my data points are not evenly spaced. I also found some other modules such as Stingray, but it needs bins of equal size, and I don't think I can rebin because that would lose data.

My idea was to take the difference between every time point and the point with the biggest lag (would be 312), but then I only have 534 data points while the correlation gives me 1067 (I still don't know why the correlation doubles the points...). I am running out of ideas. I think the following is the important code block:

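import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.signal
from scipy import signal

def ccf_values(series1, series2):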
    p = series1
    q = series2
    p = (p - np.mean(p)) / (np.std(p) * len(p))
    q = (q - np.mean(q)) / (np.std(q))  
    c = scipy.signal.correlate(p, q, 'full')
    return c

#fermi_df=merged_df[merged_df['fermi_data']!='nan']
#print(fermi_df)

#ztf_lc.set_index('filter', inplace=True)
#test = pd.DataFrame({'r-data': [ztf_lc.loc['ZTF_r', 'fluxtot']]})
# 
ztf_r_frame=ztf_lc[ztf_lc['filter']=='ZTF_r']


#ccf_ielts = ccf_values(fermi_lc['y'], test.iloc[:,9])

ztf_zeit = ztf_r_frame['mjd']
fermi_zeit = fermi_lc.iloc[:,0]

ztf_df = pd.DataFrame({'Time': ztf_zeit, 'ztf_data': ztf_r_frame['mjd']})
fermi_df = pd.DataFrame({'Time': fermi_zeit, 'fermi_data': fermi_lc['y']})

merged_df = pd.merge(ztf_df, fermi_df, on='Time', how='outer')
merged_df.sort_values(by='Time', inplace=True)

merged_df['ztf_data'] = merged_df['ztf_data'].fillna(0)
merged_df['fermi_data'] = merged_df['fermi_data'].fillna(0)
zeitdifferenzen_df = merged_df['Time']

# Take the time differences from the 'time' column in sortierter_zeit_df.

zeitdifferenzen_df['Time'] = zeitdifferenzen_df - zeitdifferenzen_df.iloc[312]
#zeitdifferenzen_df = zeitdifferenzen_df[(zeitdifferenzen_df != 0).all(1)]

# Now zeitdifferenzen_df contains the time differences from each time point to time point 1.

#print(zeitdifferenzen_df)

ccf_ielts = ccf_values(merged_df['ztf_data'],merged_df['fermi_data'])
#ccf_ielts = ccf_values(merged_df['ztf_data'],merged_df['ztf_data'])

lags = signal.correlation_lags(len(merged_df['fermi_data']), len(merged_df['ztf_data']))
#lags = signal.correlation_lags(len(merged_df['ztf_data']), len(merged_df['ztf_data']))

def ccf_plot(lags, ccf):
    fig, ax = plt.subplots(figsize=(9, 6))
    ax.plot(lags, ccf)
    ax.axhline(-2/np.sqrt(23), color='red', label='5% confidence interval')
    ax.axhline(2/np.sqrt(23), color='red')
    ax.axvline(x=0, color='black', lw=1)
    ax.axhline(y=0, color='black', lw=1)
    ax.axhline(y=np.max(ccf), color='blue', lw=1, linestyle='--', label='highest +/- correlation')
    ax.axhline(y=np.min(ccf), color='blue', lw=1, linestyle='--')
    ax.set(ylim=[-1, 1])
    ax.set_title('Cross Correlation IElTS Search and Registration Count', weight='bold', fontsize=15)
    ax.set_ylabel('Correlation Coefficients', weight='bold', fontsize=12)
    ax.set_xlabel('Time Lags', weight='bold', fontsize=12)
    plt.legend()

#ccf_plot(zeitdifferenzen_df['time'], ccf_ielts)
ccf_plot(zeitdifferenzen_df['Time'], ccf_ielts)

merged_df is a data frame with 534 data points per column. I don't know if I left out any important information; if so, please let me know.
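
On the 1067 values: this matches what mode 'full' is expected to return for two inputs of length 534, namely one value for every possible shift, so 534 + 534 - 1 = 1067. A minimal sketch with random stand-in data (the arrays a and b are placeholders, not the real light curves) that checks the lengths and the lag range reported by scipy.signal.correlation_lags:

```python
import numpy as np
from scipy import signal

n = 534  # number of data points reported for merged_df

# Random stand-in data; these are NOT the real light curves.
rng = np.random.default_rng(0)
a = rng.normal(size=n)
b = rng.normal(size=n)

# 'full' mode evaluates every possible overlap of the two inputs.
ccf = signal.correlate(a, b, mode="full")
lags = signal.correlation_lags(len(a), len(b), mode="full")

print(len(ccf), lags[0], lags[-1])  # 1067 -533 533
```

The lags are index shifts running from -(n - 1) to n - 1. They only translate into days if neighbouring samples are always the same time apart.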

The plot: (image not included)

Thank you very much
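
One possible way to get lags in days, sketched under assumptions: interpolate both curves onto a shared, evenly spaced time grid, so that each index shift corresponds to a fixed step dt and the time lag is simply lags * dt. The helper time_lag_ccf, the grid step dt and the synthetic pulses below are all hypothetical and only illustrate the idea. Linear interpolation invents values between measurements, so this changes the data and may not be acceptable for the thesis; methods built for unevenly sampled data, such as the discrete correlation function, avoid that step.

```python
import numpy as np
from scipy import signal

def time_lag_ccf(t1, y1, t2, y2, dt):
    """Cross-correlate two unevenly sampled curves on a shared uniform grid.

    t1 and t2 must be sorted in ascending order (np.interp requires this).
    Returns the lags in the time unit of t1/t2 and the correlation values.
    """
    # Only use the time window covered by both curves.
    start = max(t1.min(), t2.min())
    stop = min(t1.max(), t2.max())
    grid = np.arange(start, stop + dt / 2, dt)

    # Linear interpolation onto the uniform grid (an assumption, see above).
    g1 = np.interp(grid, t1, y1)
    g2 = np.interp(grid, t2, y2)

    # Same normalization as ccf_values in the question.
    g1 = (g1 - g1.mean()) / (g1.std() * len(g1))
    g2 = (g2 - g2.mean()) / g2.std()

    ccf = signal.correlate(g1, g2, mode="full")
    lags = signal.correlation_lags(len(g1), len(g2), mode="full")
    return lags * dt, ccf

# Synthetic check: two unevenly sampled pulses, the second delayed by 10 days.
rng = np.random.default_rng(1)
t1 = np.sort(rng.uniform(0, 200, 400))
t2 = np.sort(rng.uniform(0, 200, 300))

def pulse(t):
    return np.exp(-0.5 * ((t - 100) / 8) ** 2)

y1 = pulse(t1)
y2 = pulse(t2 - 10)  # peaks 10 days later than y1

lag_days, ccf = time_lag_ccf(t1, y1, t2, y2, dt=0.5)
best = lag_days[np.argmax(ccf)]
print(best)
```

In this synthetic setup the peak should land close to -10 days: with scipy.signal.correlate(a, b), a negative lag at the maximum means the second series trails the first. The choice of dt is a trade-off, since a finer grid does not add information beyond what the original sampling contains.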
