Subtitle Edit

the subtitle editor :)


Third-Party Components

Subtitle Edit uses several third-party tools for features like video playback, audio extraction, and OCR. While Subtitle Edit includes built-in downloaders for these components, you might want to use a specific version or a custom build.

Subtitle Edit 5 also offers additional downloadable AI components for speech-to-text, text-to-speech, and OCR. Prefer the in-app download prompts unless you need to install a specific build manually.

⚠️ Warning: Subtitle Edit is tested with specific versions of these components. Using other versions is not officially supported and may cause instability.
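Since the warning above is about versions, it can help to note which version of a manually installed tool is actually in place before swapping it. The following is a minimal sketch, not part of Subtitle Edit: `first_line` and `tool_version` are hypothetical helper names, and the default `-version` flag is the one FFmpeg accepts (other tools, such as yt-dlp and Tesseract, use `--version`).

```python
import subprocess


def first_line(text):
    """Return the first non-empty line of a tool's version output, or ""."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def tool_version(executable, flag="-version"):
    """Run `executable flag` and return the first line it prints.

    Some tools print version information to stderr, so fall back to it
    when stdout is empty.
    """
    result = subprocess.run(
        [executable, flag], capture_output=True, text=True, check=False
    )
    return first_line(result.stdout or result.stderr)
```

For example, `tool_version(r"C:\path\to\ffmpeg.exe")` would return the banner line of that build, where the path is a placeholder for wherever the executable was copied.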

Where are the files located?

Subtitle Edit stores these components in its Data Folder.

Tip: You can open the Data Folder directly from Subtitle Edit by pressing Ctrl+Alt+Shift+D (Windows/Linux) or Cmd+Alt+Shift+D (macOS).


Windows

Quick Reference Table

Component | File(s) | Destination Path
--- | --- | ---
FFmpeg | ffmpeg.exe, ffprobe.exe (optional) | [Data Folder]/ffmpeg
MPV | libmpv-2.dll | [Data Folder] (root)
yt-dlp | yt-dlp.exe | [Data Folder] (root)
Tesseract | tesseract.exe, tessdata/ folder | [Data Folder]/Tesseract550
Whisper CPP | whisper-cli.exe, Models/ folder | [Data Folder]/SpeechToText/Cpp
Purfview Faster-Whisper XXL | faster-whisper-xxl.exe, _models/ folder | [Data Folder]/SpeechToText/Purfview-Faster-Whisper-XXL
Crisp ASR | crispasr.exe, models/ folder | [Data Folder]/CrispASR
Qwen3 ASR CPP | qwen3-asr-cli.exe, models/ folder | [Data Folder]/Qwen3ASR
Parakeet.cpp | parakeet.exe, model folders | [Data Folder]/parakeet.cpp
PaddleOCR | paddleocr.exe, models/ folder | [Data Folder]/OCR/PaddleOCR3-1
Qwen3 TTS (CrispASR) | shares crispasr.exe + models/ from [Data Folder]/CrispASR; reference voices in voices/ | [Data Folder]/TextToSpeech/Qwen3TtsCrispAsr (voices only)
Chatterbox TTS (CrispASR) | shares crispasr.exe + models/ from [Data Folder]/CrispASR; reference voices in voices/ | [Data Folder]/TextToSpeech/Chatterbox (voices only)
OmniVoice TTS | omnivoice-tts.exe, omnivoice-codec.exe, models/, voices/ | [Data Folder]/TextToSpeech/OmniVoice
Kokoro TTS | kokoro-tts-server.exe, models/ | [Data Folder]/TextToSpeech/KokoroTtsCpp
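After copying files by hand, a quick check that they landed where the table says can save some guesswork. The sketch below is only an illustration: `EXPECTED` and `missing_components` are hypothetical names, the dictionary covers just the first few rows of the table (optional files such as ffprobe.exe are left out), and the Data Folder path must be supplied by you, since its location is not stated here.

```python
from pathlib import Path

# Subset of the Windows quick reference table.
# Each entry maps a component to (subfolder of the Data Folder, required items).
# An empty subfolder string means the Data Folder root.
EXPECTED = {
    "FFmpeg": ("ffmpeg", ["ffmpeg.exe"]),
    "MPV": ("", ["libmpv-2.dll"]),
    "yt-dlp": ("", ["yt-dlp.exe"]),
    "Tesseract": ("Tesseract550", ["tesseract.exe", "tessdata"]),
    "Whisper CPP": ("SpeechToText/Cpp", ["whisper-cli.exe", "Models"]),
}


def missing_components(data_folder):
    """Return {component: [missing relative paths]} for incomplete components."""
    root = Path(data_folder)
    report = {}
    for name, (subdir, items) in EXPECTED.items():
        missing = [
            (Path(subdir) / item).as_posix()
            for item in items
            if not (root / subdir / item).exists()
        ]
        if missing:
            report[name] = missing
    return report
```

Calling `missing_components(...)` with the folder opened by the Data Folder shortcut returns an empty dictionary when every listed file and folder is present. A component that shows up in the result was either never installed or was copied to a different location.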

FFmpeg

Used for reading media info, extracting audio, and generating waveforms.
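To give a rough idea of what these tasks look like on the command line, the sketch below builds one ffprobe call that reports container and stream information as JSON and one ffmpeg call that extracts mono 16 kHz WAV audio. These are ordinary FFmpeg invocations chosen for illustration and are not necessarily the exact commands Subtitle Edit issues. The function names and the sample rate are assumptions.

```python
import subprocess


def probe_command(ffprobe, media):
    """Build an ffprobe call that prints format and stream info as JSON."""
    return [
        ffprobe, "-v", "error",
        "-print_format", "json",
        "-show_format", "-show_streams",
        media,
    ]


def extract_audio_command(ffmpeg, media, wav_out, sample_rate=16000):
    """Build an ffmpeg call that writes the audio track as mono WAV."""
    return [
        ffmpeg, "-y",              # overwrite the output file if it exists
        "-i", media,
        "-vn",                     # drop the video stream
        "-ac", "1",                # downmix to one channel
        "-ar", str(sample_rate),   # resample to the requested rate
        wav_out,
    ]


def run(command):
    """Run a prepared command and return its standard output as text."""
    return subprocess.run(
        command, capture_output=True, text=True, check=True
    ).stdout
```

Running `run(probe_command(...))` against a manually installed build is also a simple way to confirm that the executables in [Data Folder]/ffmpeg start correctly.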

MPV Media Player (libmpv)

Used as a video player engine.

yt-dlp (Online Video Playback)

Used to enable mpv to stream online videos (e.g., YouTube, Vimeo, and many other sites) via Video > Open from URL.

Tip: Subtitle Edit can download yt-dlp automatically. When you use Video > Open from URL for the first time, you will be prompted to download it.

Tesseract OCR

Used for converting image-based subtitles (Sup/VobSub) to text.
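A manually installed Tesseract can be tried outside Subtitle Edit on a single image. The sketch below assumes a subtitle line has already been saved as an image file (Subtitle Edit handles the Sup/VobSub decoding itself, which is not shown here) and that the language data is in the tessdata/ folder next to the executable. `tesseract_command` is a hypothetical helper name.

```python
import subprocess


def tesseract_command(tesseract, image, language, tessdata_dir):
    """Build a Tesseract call that prints recognized text to stdout."""
    return [
        tesseract,
        image,
        "stdout",                        # output base "stdout" prints the text
        "-l", language,                  # language code, e.g. "eng"
        "--tessdata-dir", tessdata_dir,  # folder holding the .traineddata files
    ]


def ocr_image(tesseract, image, language, tessdata_dir):
    """Run Tesseract on one image and return the recognized text."""
    command = tesseract_command(tesseract, image, language, tessdata_dir)
    result = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

If this call fails with a missing-language error, the usual cause is that the matching .traineddata file is absent from the tessdata/ folder.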

Whisper CPP (Speech-to-Text)

Used for AI-based speech recognition.

Note: It is generally recommended to use the internal downloader for Whisper due to the complexity of model and library dependencies.

Purfview Faster-Whisper (GPU Speech-to-Text)

Used for GPU-accelerated AI-based speech recognition.

SE5 Speech-to-Text Engines

Subtitle Edit 5 can download additional ASR engines directly from the Speech to text window.

See Speech to Text for the current engine list and workflow.

PaddleOCR

Used for OCR of image-based subtitles.

Local Text-to-Speech Engines

Subtitle Edit 5 can download local TTS servers and models from the Text to speech window.

See Text to Speech for engine-specific options.


Linux

FFmpeg

Used for reading media info, extracting audio, and generating waveforms.

MPV Media Player (libmpv)

Used as a video player engine.

yt-dlp (Online Video Playback)

Used to enable mpv to stream online videos via Video > Open from URL.

Tip: Subtitle Edit can download yt-dlp automatically when you use Video > Open from URL for the first time.

Tesseract OCR

Used for converting image-based subtitles (Sup/VobSub) to text.

Whisper CPP (Speech-to-Text)

Used for AI-based speech recognition.

Purfview Faster-Whisper (GPU Speech-to-Text)

Used for GPU-accelerated AI-based speech recognition.

SE5 Speech-to-Text, OCR, and TTS Engines

The same data-folder layout is used on Linux. Prefer the in-app downloaders for Crisp ASR, Qwen3 ASR, Parakeet.cpp, PaddleOCR, Qwen3 TTS (CrispASR), Chatterbox TTS (CrispASR), OmniVoice TTS, and Kokoro TTS because the required files differ by build and model.


macOS

FFmpeg

Used for reading media info, extracting audio, and generating waveforms.

MPV Media Player (libmpv)

Used as a video player engine.

yt-dlp (Online Video Playback)

Used to enable mpv to stream online videos via Video > Open from URL.

Tip: Subtitle Edit can download yt-dlp automatically when you use Video > Open from URL for the first time.

Tesseract OCR

Used for converting image-based subtitles (Sup/VobSub) to text.

Whisper CPP (Speech-to-Text)

Used for AI-based speech recognition.

SE5 Speech-to-Text, OCR, and TTS Engines

Some newer local engines are platform-specific or model-specific. Use the in-app downloaders where available, and see Speech to Text, Text to Speech, and OCR for current engine notes.