Having issues with the autocorrelation of a lagged time series in Python


I am trying to compute a lagged autocorrelation of some data. The data contains NaNs scattered at random. It is an NxN array laid out so that, for each row i, eke_array[i, i] is the lag-zero value and eke_array[i, i+n] is the value at lag n. At some point the function starts returning autocorrelations greater than 1, especially after lag 40, even though the first few lags give reasonable correlations.

Here is the function I implemented:

import numpy as np

def auto_corr(eke_array):

    nlag = np.shape(eke_array)[0]   # array is nlag x nlag (105 x 105 here)
    eke_lag_0 = []
    auto_store = []
    std_store = []
    for mylag in range(0, 80):      # lags 0..79
        autocorr = 0
        numvalid = 0
        std = 0
        for i in range(0, 105):     # loop over records (rows)
            if i + mylag > 104:
                break
            lag0 = eke_array[i, i]              # lag-zero value for record i
            lag_itt = eke_array[i, i + mylag]   # value mylag steps later
            if not np.isnan(lag0) and not np.isnan(lag_itt):
                numvalid += 1
                diff = lag0 - lag_itt
                std = std + diff ** 2
                autocorr = autocorr + lag0 * lag_itt
        print(numvalid, mylag, i)
        auto_store.append(autocorr / numvalid)
        std_store.append(std / numvalid)

    # Normalization: mean square of the (non-NaN) lag-zero values
    for k in range(105):
        eke_lag_0.append(eke_array[k, k])
    eke_lag_0 = np.asarray(eke_lag_0)
    eke_lag_0 = eke_lag_0[~np.isnan(eke_lag_0)]
    norm = np.sum(eke_lag_0 ** 2) / np.size(eke_lag_0)

    std_store = np.sqrt(np.asarray(std_store))
    auto_store = np.asarray(auto_store) / norm
    return std_store, auto_store, norm
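
For reference, this is roughly how I call it. The array below is just a random placeholder with the same 105 x 105 shape and some scattered NaNs, not my real data:

import numpy as np

rng = np.random.default_rng(0)
eke_array = rng.standard_normal((105, 105))         # placeholder values, same shape as my data
eke_array[rng.random((105, 105)) < 0.1] = np.nan    # scatter some NaNs like the real data

std_store, auto_store, norm = auto_corr(eke_array)
print(auto_store[:10])   # correlation at the first few lags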

There is 1 answer below.

Answered by drwnthembirds:

I was normalizing the wrong way. Because each lag compares two different (shifted) series, this is really a cross-correlation, so the normalization should use the standard deviations of the two lagged time series rather than the mean square of the lag-zero values alone.
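
Roughly, the fix looks like the sketch below. I am reading "standard deviation of the two lagged time series" in the usual Pearson sense (center each series of paired values and divide by the product of their standard deviations), which keeps every lag in [-1, 1]; the name auto_corr_normalized and the max_lag argument are just illustrative and not part of the original code.

import numpy as np

def auto_corr_normalized(eke_array, max_lag=80):
    # Lagged correlation where each lag is normalized by the standard
    # deviations of the two series actually being compared.
    n = eke_array.shape[0]
    corr_store = []
    for mylag in range(max_lag):
        lag0_vals, lag_vals = [], []
        for i in range(n - mylag):
            a = eke_array[i, i]             # lag-zero value for record i
            b = eke_array[i, i + mylag]     # the same record mylag steps later
            if not np.isnan(a) and not np.isnan(b):
                lag0_vals.append(a)
                lag_vals.append(b)
        lag0_vals = np.asarray(lag0_vals)
        lag_vals = np.asarray(lag_vals)
        if lag0_vals.size < 2:
            corr_store.append(np.nan)       # not enough valid pairs at this lag
            continue
        # Covariance of the paired values divided by the product of their
        # standard deviations (Pearson correlation for this lag).
        a0 = lag0_vals - lag0_vals.mean()
        b0 = lag_vals - lag_vals.mean()
        denom = lag0_vals.std() * lag_vals.std()
        if denom == 0:
            corr_store.append(np.nan)       # constant series, correlation undefined
            continue
        corr_store.append(np.mean(a0 * b0) / denom)
    return np.asarray(corr_store)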