How to Find 3 Local Maxima in Fourier Transform Pitch Detection Function?


I have a function that detects the three most dominant frequencies in an incoming microphone stream. When I play an "E4" note (about 330 Hz) on my piano, it reports the fundamental frequency as B5 (996 Hz). There are occasionally other issues, such as reporting a C4 as a C5, but this one is glaring. When I plot the spectrum, the E is clearly the most dominant peak, yet the function still returns B5.
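As a sanity check on the note names and frequencies involved, here is a minimal sketch that converts between note names and equal-tempered frequencies. It assumes A4 = 440 Hz tuning, and the helper names `note_to_freq` and `freq_to_note` are made up for illustration. It suggests that the reported 996 Hz sits close to the third harmonic of E4, which may be relevant to the misdetection.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_to_freq(name, octave, a4=440.0):
    """Equal-tempered frequency in Hz for a note such as ("E", 4)."""
    # MIDI note numbers: C-1 is 0, so A4 is 69
    midi = 12 * (octave + 1) + NOTE_NAMES.index(name)
    return a4 * 2.0 ** ((midi - 69) / 12)


def freq_to_note(freq, a4=440.0):
    """Name of the equal-tempered note nearest to freq (in Hz)."""
    midi = round(69 + 12 * math.log2(freq / a4))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


e4 = note_to_freq("E", 4)
print(round(e4, 2))          # 329.63
print(freq_to_note(3 * e4))  # B5 (third harmonic of E4)
print(freq_to_note(996.0))   # B5
```

If the strongest peak the function picks is a harmonic rather than the fundamental, the note name will come out as that harmonic's nearest note, which would match the E4 to B5 symptom.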

import struct

import numpy as np
from scipy.signal import argrelextrema


def pitch_calculations(stream, CHUNK, RATE):
    # Read CHUNK frames from the mic stream and unpack the raw bytes into 16-bit signed integers
    data = stream.read(CHUNK, exception_on_overflow=False)
    dataInt = np.array(struct.unpack(str(CHUNK) + 'h', data))

    # Apply a window function (Hamming) to the input data
    windowed_data = np.hamming(CHUNK) * dataInt

    # Use NumPy's fast Fourier transform to get the scaled magnitude spectrum
    # and the frequency (in Hz) of each bin
    fft_result = np.abs(np.fft.fft(windowed_data)) * 2 / (11000 * CHUNK)
    freqs = np.fft.fftfreq(len(windowed_data), d=1.0 / RATE)

    # Find the indices of local maxima in the magnitude spectrum
    localmax_indices = argrelextrema(fft_result, np.greater)[0]

    # Get the magnitudes of the local maxima
    strong_freqs = fft_result[localmax_indices]

    # Sort the peaks by magnitude in descending order
    sorted_indices = np.argsort(strong_freqs)[::-1]

    # np.fft.fft returns every peak twice (at +f and -f), so take the six
    # highest peaks in order to cover three distinct frequencies
    top_indices = sorted_indices[:6]

    # Take every other peak, assuming each mirrored pair sorts next to each
    # other, and use abs() to drop the sign of negative-frequency bins
    note_1_freq = abs(freqs[localmax_indices[top_indices[0]]])
    note_2_freq = abs(freqs[localmax_indices[top_indices[2]]])
    note_3_freq = abs(freqs[localmax_indices[top_indices[4]]])

    return note_1_freq, note_2_freq, note_3_freq
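For comparison, here is a minimal sketch of the same peak-picking idea using `np.fft.rfft`, which returns only the non-negative frequencies of a real signal, so there are no mirrored pairs to skip over. The function name `top_peak_freqs` and the synthetic test tone are assumptions for illustration, not part of the original code, and the local-maximum test is written in plain NumPy to mirror `argrelextrema(..., np.greater)`.

```python
import numpy as np


def top_peak_freqs(samples, rate, n_peaks=3):
    """Frequencies (Hz) of the n_peaks strongest local maxima in the spectrum."""
    samples = np.asarray(samples, dtype=float)

    # Window the block, then take the one-sided FFT of the real signal
    windowed = samples * np.hamming(len(samples))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)

    # A bin is a local maximum if it is strictly greater than both neighbours
    interior = magnitudes[1:-1]
    is_peak = (interior > magnitudes[:-2]) & (interior > magnitudes[2:])
    peak_bins = np.flatnonzero(is_peak) + 1

    # Keep the n_peaks largest peaks, strongest first
    order = np.argsort(magnitudes[peak_bins])[::-1][:n_peaks]
    return freqs[peak_bins[order]]


# Synthetic stand-in for a piano note: a 330 Hz tone with two weaker harmonics
RATE = 44100
CHUNK = 4096
t = np.arange(CHUNK) / RATE
tone = (np.sin(2 * np.pi * 330 * t)
        + 0.5 * np.sin(2 * np.pi * 660 * t)
        + 0.25 * np.sin(2 * np.pi * 990 * t))

# Roughly 330, 660 and 990 Hz, each within one FFT bin (about 10.8 Hz here)
print(top_peak_freqs(tone, RATE))
```

Note that this only ranks peaks by magnitude. On a real instrument a harmonic can be stronger than the fundamental, in which case the first value returned would still be the harmonic rather than the played note.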

Here is an image of my graph:

Fourier Graph of playing an "E4" note
