How to extract an RTTM file from a .wav audio file


I would like to extract the .rttm file for an input .wav audio file in Python.

def extract_rttm_file(wav_path):
  """Extracts the .rttm file from the converted wav file.

  Args:
    wav_path: The path to the converted wav file.

  Returns:
    The path to the .rttm file.
  """

  output_path = os.path.splitext(wav_path)[0] + ".rttm"
  subprocess.call(["sox", wav_path, "-rttm", output_path])
  return output_path

I tried the above code, but it doesn't output the .rttm file.

Answer by Gardner Bickford:

You can use pyannote-audio to do speaker diarization in Python and write the result in RTTM format. See the speaker-diarization model on Hugging Face for more info.

Example:

# 1. visit hf.co/pyannote/speaker-diarization and accept user conditions
# 2. visit hf.co/pyannote/segmentation and accept user conditions
# 3. visit hf.co/settings/tokens to create an access token
# 4. instantiate pretrained speaker diarization pipeline
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="ACCESS_TOKEN_GOES_HERE")


# apply the pipeline to an audio file
diarization = pipeline("audio.wav")

# dump the diarization output to disk using RTTM format
with open("audio.rttm", "w") as rttm:
    diarization.write_rttm(rttm)
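
To tie this back to the function in the question: sox is an audio conversion tool and has no RTTM output option, so the subprocess call cannot produce the file. A minimal sketch of the same function built around the pipeline above might look like the following. The helper names `rttm_path_for` and `read_rttm` are made up for illustration, the model name is taken from step 1, and the `use_auth_token` argument follows the example above (newer pyannote.audio releases may name it differently). `read_rttm` assumes the usual 10-field `SPEAKER` line layout and is only meant as a quick sanity check of the written file.

```python
import os


def rttm_path_for(wav_path):
    """Return the .rttm path next to the .wav file (same rule as in the question)."""
    return os.path.splitext(wav_path)[0] + ".rttm"


def extract_rttm_file(wav_path, access_token):
    """Diarize wav_path with pyannote.audio and write the result as RTTM.

    access_token is a Hugging Face access token supplied by the caller
    (assumption: the user conditions from steps 1 and 2 were accepted).
    """
    # Imported here so the other helpers work without pyannote installed.
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                        use_auth_token=access_token)
    diarization = pipeline(wav_path)

    output_path = rttm_path_for(wav_path)
    with open(output_path, "w") as rttm:
        diarization.write_rttm(rttm)
    return output_path


def read_rttm(rttm_path):
    """Parse SPEAKER lines into (start_seconds, duration_seconds, speaker) tuples."""
    turns = []
    with open(rttm_path) as rttm:
        for line in rttm:
            fields = line.split()
            # Assumed layout:
            # SPEAKER <file> <channel> <start> <duration> <NA> <NA> <speaker> <NA> <NA>
            if len(fields) < 8 or fields[0] != "SPEAKER":
                continue  # skip blank or non-speaker lines
            turns.append((float(fields[3]), float(fields[4]), fields[7]))
    return turns
```

Calling `extract_rttm_file("audio.wav", "ACCESS_TOKEN_GOES_HERE")` should then write `audio.rttm` next to the input, and `read_rttm("audio.rttm")` gives one tuple per speaker turn.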