I have a working video transcription pipeline that uses a local OpenAI Whisper model. I would like to switch to the equivalent distilled model ("distil-small.en"), which is smaller and faster.
import whisper

def transcribe(self):
    file = "/path/to/video"
    model = whisper.load_model("small.en")          # WORKS
    model = whisper.load_model("distil-small.en")   # DOES NOT WORK
    transcript = model.transcribe(word_timestamps=True, audio=file)
    print(transcript["text"])
However, I get an error that the model was not found:
RuntimeError: Model distil-small.en not found; available models = ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large-v2', 'large-v3', 'large']
I installed my dependencies with Poetry (which uses pip under the hood) as follows:
[tool.poetry.dependencies]
python = "^3.11"
openai-whisper = "*"
transformers = "*" # distilled whisper models
accelerate = "*" # distilled whisper models
datasets = { version = "*", extras = ["audio"] } # distilled whisper models
The Distil-Whisper documentation on GitHub appears to use a different approach to installing and using these models.
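For context, my understanding of their documented approach is roughly the following transformers pipeline (the model ID comes from their docs; the audio path here is a placeholder):

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

# Load the distilled checkpoint from the Hugging Face Hub
model_id = "distil-whisper/distil-small.en"
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)
processor = AutoProcessor.from_pretrained(model_id)

# Wrap model, tokenizer, and feature extractor in an ASR pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
)

result = pipe("/path/to/audio")
print(result["text"])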
Is it possible to use a Distilled model as a drop-in replacement for a regular Whisper model?
load_model with a string parameter only works with OpenAI's known list of models. If you want to use your own model, you will need to download it from the Hugging Face Hub or elsewhere first. See: https://huggingface.co/distil-whisper/distil-small.en#running-whisper-in-openai-whisper

You can also see in OpenAI's source for load_model that the string parameter is only checked against the known model names (which is what produces the error you showed).
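Following that page, a minimal sketch of the drop-in route looks like this (it assumes the distil-whisper repo ships an original-model.bin checkpoint in openai-whisper format, as the model card linked above describes):

import whisper
from huggingface_hub import hf_hub_download

# Download the openai-whisper-format checkpoint from the Hub
# (repo and filename per the model card linked above)
checkpoint = hf_hub_download(
    repo_id="distil-whisper/distil-small.en",
    filename="original-model.bin",
)

# load_model accepts a local checkpoint path as well as a known model name
model = whisper.load_model(checkpoint)
transcript = model.transcribe(word_timestamps=True, audio="/path/to/video")
print(transcript["text"])

The rest of your pipeline should then work unchanged, since the returned model exposes the same transcribe interface as the stock Whisper models.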