How can I trim silence at the start and end of a recording (WAV) in Python?


I am trying to remove the silence at the start and end of this recording so that only the voice in between remains. Silence within the speech does not need to be removed, only the leading and trailing portions of the audio file.

[Figure: amplitude vs. time graph of the recording]

import wave
import numpy as np
import matplotlib.pyplot as plt


filename = "my_recd.wav"

# Load the saved audio file for plotting (assumes 16-bit mono PCM)
with wave.open(filename, "rb") as waveform:
    sample_rate = waveform.getframerate()  # read the rate from the file instead of assuming 44100
    raw = waveform.readframes(waveform.getnframes())
signal = np.frombuffer(raw, dtype=np.int16)

# Time axis in seconds, one entry per sample
time = np.arange(len(signal)) / sample_rate

plt.figure(figsize=(10, 4))
plt.plot(time, signal, color='blue', label='Audio Waveform')
plt.title("Audio Waveform")
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.grid(True)
plt.legend()
plt.show()

I first tried to locate the point where the audio starts exceeding a threshold. Using a moving average with a threshold of 1100 and a window size of 100, I was able to locate the start of the speech in this recording. I then tried applying the same approach to the reversed array to find the end, but that failed. I am looking for a solution where I do not have to hardcode the threshold values. I would rather not bring in any speech recognition libraries; I want a minimal solution to this problem.
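One possible way to avoid a hardcoded threshold is to express it relative to the recording itself. The sketch below is only an illustration under stated assumptions, not a tested solution for this particular file: the function name `trim_silence` and the parameters `window` and `rel_threshold` are hypothetical, the input is assumed to be a 1-D mono array such as `signal` above, and the threshold is taken as a fraction of the peak of the moving-average envelope. Because the first and last samples above the threshold are found in one pass, the array does not need to be reversed.

```python
import numpy as np


def trim_silence(signal, window=100, rel_threshold=0.1):
    """Trim leading and trailing quiet samples from a 1-D mono signal.

    Returns (trimmed_signal, start_index, end_index). The threshold is
    rel_threshold times the peak of the smoothed envelope, so it scales
    with the loudness of the recording (an assumption of this sketch).
    """
    signal = np.asarray(signal)
    if signal.size == 0:
        return signal, 0, 0

    # Moving average of the absolute amplitude acts as a simple envelope.
    window = max(1, min(window, signal.size))
    kernel = np.ones(window) / window
    envelope = np.convolve(np.abs(signal.astype(np.float64)), kernel, mode="same")

    # Samples whose envelope exceeds a fraction of the envelope's peak.
    loud = np.flatnonzero(envelope > rel_threshold * envelope.max())
    if loud.size == 0:
        # Completely silent input (or rel_threshold >= 1): nothing to keep.
        return signal[:0], 0, 0

    start, end = int(loud[0]), int(loud[-1]) + 1
    return signal[start:end], start, end
```

Calling `trimmed, start, end = trim_silence(signal)` would keep everything between the first and last loud sample, including any pauses in between. The value of `rel_threshold` may still need tuning for noisy recordings, since a loud click in the silent portion would also count as sound.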
