Subtitle Edit

the subtitle editor :)


Speech to Text

Subtitle Edit can automatically transcribe audio to text using Whisper-based speech recognition engines.

Supported Engines

Engine                          Platform                 GPU Support
Whisper.cpp                     Windows, Linux, macOS    CPU only
Whisper.cpp (cuBLAS)            Windows                  NVIDIA CUDA
Whisper.cpp (Vulkan)            Windows                  Vulkan GPU
Purfview’s Faster Whisper XXL   Windows, Linux           NVIDIA CUDA
Whisper CTranslate2             Windows, Linux, macOS    CPU only
Const-me’s Whisper              Windows                  DirectX
OpenAI Whisper                  All (Python required)    NVIDIA CUDA
Chat LLM cpp                    Windows, Linux           CPU/GPU

Engines and models are downloaded automatically on first use.

How to Use

  1. Open a video file in Subtitle Edit
  2. Go to Video → Speech to text (Whisper)…
  3. Select an Engine from the dropdown
  4. Select a Model (larger models are more accurate but slower)
  5. Select the Language of the audio
  6. Optionally enable:
    • Translate to English — Translate non-English audio to English
    • Adjust timings — Post-process timing using waveform data
    • Post-processing — Fix casing, merge lines, add periods, etc.
  7. Click Transcribe
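For readers who want to see roughly what these dialog choices correspond to outside the GUI, here is a minimal sketch that builds a command line for the OpenAI Whisper engine's `whisper` tool. It is an illustration only: the helper name `build_whisper_command` is hypothetical, and the exact command Subtitle Edit runs for each engine is not described on this page.

```python
from typing import List, Optional


def build_whisper_command(
    media_path: str,
    model: str = "small",
    language: Optional[str] = None,
    translate: bool = False,
) -> List[str]:
    """Build an argument list for the openai-whisper command-line tool.

    Hypothetical helper for illustration; not Subtitle Edit's own code.
    """
    cmd = ["whisper", media_path, "--model", model, "--output_format", "srt"]
    if language:
        # Leaving --language out lets Whisper auto-detect the spoken language.
        cmd += ["--language", language]
    # "translate" produces English text; "transcribe" keeps the source language.
    cmd += ["--task", "translate" if translate else "transcribe"]
    return cmd


if __name__ == "__main__":
    # Mirrors steps 3-6: medium model, German audio, translated to English.
    print(build_whisper_command("movie.mkv", model="medium", language="de", translate=True))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting problems with file names that contain spaces.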

Models

Each engine has its own set of models. Common Whisper model sizes, from smallest to largest, are tiny, base, small, medium, and large.

Models ending in .en are English-only and perform better for English audio.
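The `.en` rule can be expressed as a small selection helper. This is a sketch that assumes the model naming of the original OpenAI Whisper release, where tiny, base, small, and medium have `.en` variants and large does not; other engines may name their models differently, and `pick_model` is a hypothetical name.

```python
# Sizes that have an English-only ".en" variant in the original OpenAI Whisper
# release (assumption: the selected engine follows the same naming).
SIZES_WITH_EN_VARIANT = {"tiny", "base", "small", "medium"}


def pick_model(size: str, language: str) -> str:
    """Return the English-only model name when one exists and the audio is English."""
    is_english = language.lower() in {"en", "english"}
    if is_english and size in SIZES_WITH_EN_VARIANT:
        return f"{size}.en"
    # Non-English audio, or a size without an ".en" variant: use the multilingual model.
    return size


if __name__ == "__main__":
    for size, lang in [("small", "en"), ("large", "en"), ("small", "de")]:
        print(size, lang, "->", pick_model(size, lang))
```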

Batch Mode

Transcribe multiple video files at once:

  1. Click Batch mode
  2. Add video files
  3. Click Transcribe
  4. Results are saved as .srt files next to the video files
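To predict where batch results will land, the sketch below maps each video to an .srt file in the same folder with the same base name. This is an assumption about the naming: the page only says the files are saved next to the videos, and `srt_path_for` is a hypothetical helper.

```python
from pathlib import Path
from typing import Dict, Iterable


def srt_path_for(video: Path) -> Path:
    """Return the .srt path that sits next to the given video file."""
    # with_suffix swaps only the final extension: "ep01.mkv" -> "ep01.srt".
    return video.with_suffix(".srt")


def plan_batch(videos: Iterable[Path]) -> Dict[Path, Path]:
    """Map every input video to its expected subtitle output path."""
    return {video: srt_path_for(video) for video in videos}


if __name__ == "__main__":
    for src, dst in plan_batch([Path("videos/ep01.mkv"), Path("videos/ep02.mp4")]).items():
        print(f"{src} -> {dst}")
```

A mapping like this is handy for checking, before a long batch run, whether any existing subtitle files would be overwritten.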

Advanced Settings

Click the Advanced button to configure custom command-line arguments for the Whisper engine.
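As an illustration of how a custom argument string might be combined with an engine command, the sketch below splits the string with shell-style quoting rules and appends it to a base command. How Subtitle Edit itself parses the Advanced field is not documented here, so treat this as an assumption. The flags in the example, `--beam_size` and `--initial_prompt`, belong to the OpenAI Whisper command-line tool; other engines accept different arguments.

```python
import shlex
from typing import List


def append_custom_args(base_cmd: List[str], extra: str) -> List[str]:
    """Append a user-supplied argument string to a base command.

    shlex.split honours quotes, so a quoted value containing spaces
    stays a single argument.
    """
    return base_cmd + shlex.split(extra)


if __name__ == "__main__":
    base = ["whisper", "movie.mkv", "--model", "small"]
    custom = '--beam_size 5 --initial_prompt "Hello, world"'
    print(append_custom_args(base, custom))
```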

Post-Processing Settings

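Click the Post-processing button to configure which fixes are applied to the transcribed text, such as fixing casing, merging lines, and adding periods.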

Console Log

The console log at the bottom shows real-time output from the Whisper process, useful for debugging issues.
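The same kind of real-time output can be captured when running an engine from a script. The following is a minimal sketch, not Subtitle Edit's implementation: it starts a child process, merges its error stream into standard output, and yields each line as soon as it is written. The demo uses the Python interpreter as a stand-in for a Whisper executable.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_output(cmd: List[str]) -> Iterator[str]:
    """Run cmd and yield its combined stdout/stderr one line at a time."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold error messages into the same stream
        text=True,
        bufsize=1,  # line-buffered, so lines arrive as they are produced
    ) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Stand-in for a Whisper process: prints two progress-style lines.
    demo = [sys.executable, "-c", "print('loading model'); print('done')"]
    for line in stream_output(demo):
        print("[engine]", line)
```

Merging stderr into stdout matters for debugging, because many command-line tools write progress and error messages to stderr rather than stdout.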

Tips