brew install ffmpeg # takes a while!
git clone
cd whisper.cpp
cd models
./ base.en or ./ medium.en # see full model list here


  1. Convert your audio file to a 16 kHz .wav file: ffmpeg -i SOURCE_FILE.wav -ar 16000 output.wav
  2. Then, from inside the whisper.cpp directory (not models/), run ./main -m models/ggml-medium.en.bin -f output.wav >> output.txt, which appends the transcription to output.txt. This assumes the ./main binary has already been built (e.g. with make).
  • the medium model (769M parameters) takes about 2 minutes to transcribe 3 minutes of input
  • the base model (74M parameters) takes about 1 minute to transcribe 12 minutes of input
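
The two steps and the timing notes above can be wrapped in a pair of small shell functions. This is only a sketch: the names transcribe, estimate_minutes and DRY_RUN are made up for illustration and are not part of whisper.cpp. It assumes you are in the whisper.cpp directory, that ./main is built, and that the model file has already been downloaded into models/. The timing ratios are the rough figures from the bullets above, not benchmarks.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the two steps above.
# Run from inside the whisper.cpp directory.

# transcribe SOURCE [MODEL] [OUTPUT]
#   SOURCE  input audio file
#   MODEL   base.en or medium.en (default: medium.en)
#   OUTPUT  text file the transcription is appended to (default: output.txt)
# Set DRY_RUN=1 to print the commands instead of running them.
transcribe() {
    src=$1
    model=${2:-medium.en}
    out=${3:-output.txt}
    # Write the resampled audio next to the source, without overwriting it.
    wav="${src%.*}-16k.wav"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ffmpeg -i $src -ar 16000 $wav"
        echo "./main -m models/ggml-$model.bin -f $wav >> $out"
        return 0
    fi

    # Step 1: resample to 16 kHz. Step 2: transcribe and append to $out.
    ffmpeg -i "$src" -ar 16000 "$wav" &&
        ./main -m "models/ggml-$model.bin" -f "$wav" >> "$out"
}

# estimate_minutes INPUT_MINUTES [MODEL]
#   Rough transcription time, using the ratios noted above:
#   medium: 2 minutes per 3 minutes of input
#   base:   1 minute per 12 minutes of input
estimate_minutes() {
    minutes=$1
    case ${2:-medium.en} in
        medium*) num=2; den=3 ;;
        base*)   num=1; den=12 ;;
        *) echo "no timing noted for model: $2" >&2; return 1 ;;
    esac
    awk -v m="$minutes" -v n="$num" -v d="$den" \
        'BEGIN { printf "%.1f\n", m * n / d }'
}
```

For example, after sourcing the file, DRY_RUN=1 transcribe talk.mp3 base.en prints the two commands it would run, and estimate_minutes 30 base.en gives a rough wait time in minutes for a 30 minute recording.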
