On the Whisper API, when I use a Python script to transcribe audio files in bulk, I can't get response_format ('srt' or 'vtt') to work

Question

1.4k views. Asked by waghler on 05 March 2023 at 20:26

I'm using this code to connect to the Whisper API and transcribe every mp3 in a folder, in bulk, to both srt and vtt:

import requests
import os
import openai

folder_path = "/content/audios/"
def transcribe_and_save(file_path, format):
    url = 'https://api.openai.com/v1/audio/transcriptions'
    headers = {'Authorization': 'Bearer MyToken'}
    files = {'file': open(file_path, 'rb'), 
            'model': (None, 'whisper-1'),
            'response_format': format}
    response = requests.post(url, headers=headers, files=files)
    output_path = os.path.join(folder_path, os.path.splitext(filename)[0] + '.' + format)
    with open(output_path, 'w') as f:
        f.write(response.content.decode('utf-8'))

for filename in os.listdir(folder_path):
    if filename.endswith('.mp3'):
        file_path = os.path.join(folder_path, filename)
        transcribe_and_save(file_path, 'srt')
        transcribe_and_save(file_path, 'vtt')
else:
    print('mp3s not found in folder')

When I use this code, I'm getting the following error:

"error": {
    "message": "1 validation error for Request\nbody -> response_format\n  value is not a valid enumeration member; permitted: 'json', 'text', 'vtt', 'srt', 'verbose_json' (type=type_error.enum; enum_values=[<ResponseFormat.JSON: 'json'>, <ResponseFormat.TEXT: 'text'>, <ResponseFormat.VTT: 'vtt'>, <ResponseFormat.SRT: 'srt'>, <ResponseFormat.VERBOSE_JSON: 'verbose_json'>])",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }

I've tried different values, but either they don't work or I only receive the transcription as an object in plain text, not as srt or vtt. I expect to get srt and vtt files in the same folder as the audio files.

Thanks, Javi

There are 3 best solutions below

**Thomasssb1** · Answer 1 · 2023-03-05T20:33:10.540000

I am not sure about the Whisper API, but you are using the name of a built-in Python function, format, as a parameter name. Perhaps this is why it is not working: the built-in function format might be used when calling the endpoint instead of the value you passed in.

Try renaming the parameter to something other than format, and use the new name as the value for response_format.
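
A quick way to check this hypothesis is a small sketch like the one below. The function name show_value is made up for the test and is not part of the question's code. Inside a function, a parameter named format shadows the built-in, so the value that was passed in is what ends up in the dictionary. Renaming the parameter is still good style, but shadowing on its own would not change the value sent.

```python
def show_value(format):
    # Inside the function body, the parameter shadows the built-in format().
    return {"response_format": format}


result = show_value("srt")
print(result)
```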

**waghler** · Answer 2 · 2023-03-07T07:12:02.377000

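I've found the solution. The problem was in one of the parameters: response_format has to be passed as 'response_format': (None, output_format). With a bare string, requests encodes that entry of files as a file upload (using the key as the filename), so the API does not receive a plain form value. The (None, value) tuple sends it as an ordinary form field: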

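import os
import requests

folder_path = "/content/audios/"

def transcribe_and_save(file_path, output_format):
    url = 'https://api.openai.com/v1/audio/transcriptions'
    headers = {'Authorization': 'Bearer myToken'}
    # Open the audio in a context manager so the handle is closed after the upload.
    with open(file_path, 'rb') as audio:
        # (None, value) sends a plain form field instead of a file part.
        files = {'file': audio,
                 'model': (None, 'whisper-1'),
                 'response_format': (None, output_format)}
        response = requests.post(url, headers=headers, files=files)
    output_path = os.path.join(folder_path, os.path.splitext(os.path.basename(file_path))[0] + '.' + output_format)
    with open(output_path, 'w', encoding='utf-8') as f:
        f.write(response.content.decode('utf-8'))

# Collect the mp3 files first: an else attached to the for loop would run on every pass without a break.
mp3_files = [name for name in os.listdir(folder_path) if name.endswith('.mp3')]
if not mp3_files:
    print('mp3s not found in folder')
for filename in mp3_files:
    file_path = os.path.join(folder_path, filename)
    transcribe_and_save(file_path, 'srt')
    transcribe_and_save(file_path, 'vtt')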
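
To see why the (None, value) tuple matters, the request can be prepared without sending it and the multipart body inspected. This is only an illustrative sketch: the helper name multipart_body and the dummy audio bytes are made up for the demonstration, and nothing is sent to the API.

```python
import requests

URL = "https://api.openai.com/v1/audio/transcriptions"


def multipart_body(response_format_value):
    """Prepare (but do not send) the upload and return the encoded body as text."""
    files = {
        # Dummy bytes stand in for a real mp3; the content is irrelevant here.
        "file": ("example.mp3", b"fake-audio", "audio/mpeg"),
        "model": (None, "whisper-1"),
        "response_format": response_format_value,
    }
    prepared = requests.Request("POST", URL, files=files).prepare()
    return prepared.body.decode("utf-8", errors="replace")


bare = multipart_body("srt")            # value passed as a bare string
fixed = multipart_body((None, "srt"))   # value passed as a (None, value) tuple

# Does the response_format part carry a filename (i.e. look like a file upload)?
print('name="response_format"; filename=' in bare)
print('name="response_format"; filename=' in fixed)
```

If the first check reports a filename on the response_format part and the second does not, that matches the behaviour described above: the bare string goes out as a file part, the tuple as a plain field.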

**dazzafact** · Answer 3 · 2023-03-21T15:54:29.603000

Here's a working solution for single files:

import requests
import os

OPENAI_API_KEY = "123xyzxyzxyzxyzxyzxyzxyzxyz"

token = f"Bearer {OPENAI_API_KEY}"

url = "https://api.openai.com/v1/audio/transcriptions"
model_name = "whisper-1"

# Do not set Content-Type by hand: requests adds it, including the multipart boundary.
headers = {
    "Authorization": token
}

file_path = "1.mp3"
with open(file_path, "rb") as file:
    file_content = file.read()

# Plain form fields go in data=; only the audio itself goes in files=.
payload = {
    "name": os.path.basename(file_path),
    "response_format": "json",
    "prompt": "transcribe this Chapter",
    "language": "de",
    "model": model_name
}

files = {
    "file": (os.path.basename(file_path), file_content, "audio/mp3")
}

response = requests.post(url, headers=headers, data=payload, files=files)

print(response.text)
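
One detail worth checking for snippets in this style: when files= is passed, requests generates the multipart boundary and writes it into the Content-Type header on its own. The sketch below prepares a request without sending it so the header can be inspected. The helper name prepare_upload, the dummy bytes, and the placeholder token are assumptions for illustration only.

```python
import requests

URL = "https://api.openai.com/v1/audio/transcriptions"


def prepare_upload(extra_headers=None):
    """Build the multipart request locally; nothing is sent over the network."""
    headers = {"Authorization": "Bearer PLACEHOLDER_TOKEN"}
    headers.update(extra_headers or {})
    # Plain form fields go in data=, the audio part goes in files=.
    data = {"model": "whisper-1", "response_format": "srt"}
    files = {"file": ("1.mp3", b"fake-audio", "audio/mpeg")}
    request = requests.Request("POST", URL, headers=headers, data=data, files=files)
    return request.prepare()


auto = prepare_upload()
manual = prepare_upload({"Content-Type": "multipart/form-data"})

# Compare the header requests generated with the one set by hand.
print(auto.headers["Content-Type"])
print(manual.headers["Content-Type"])
```

A header set by hand replaces the generated one, so the boundary is lost and the server may not be able to split the body into parts. Leaving Content-Type unset is usually the safer choice when using files=.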
