
Parakeet Captions

Content NVIDIA NeMo WebSocket Python
🎙️

Problem

Cloud transcription adds latency. Local models sacrifice accuracy. For live events and real-time captioning, you need both speed and quality.

Solution

Self-hosted transcription with switchable engines. Run NVIDIA Parakeet for speed or Whisper for accuracy, and switch between them on the fly.

  • Dual ASR engines (Parakeet + Whisper)
  • Sub-second latency via WebSocket
  • System audio capture for Zoom/videos
  • Session persistence and export
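One way to picture the engine switch is a small registry that routes each audio chunk to whichever engine is currently active. This is only a sketch, not the project's actual code: `EngineSwitcher`, `fake_parakeet` and `fake_whisper` are hypothetical names, and the stand-in functions return placeholder strings instead of running real Parakeet or Whisper models.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical engine signature: raw audio bytes in, transcript text out.
Engine = Callable[[bytes], str]


@dataclass
class EngineSwitcher:
    """Holds named ASR engines and routes audio to the active one."""

    engines: Dict[str, Engine] = field(default_factory=dict)
    active: str = ""

    def register(self, name: str, engine: Engine) -> None:
        self.engines[name] = engine
        if not self.active:
            self.active = name  # first registered engine becomes the default

    def switch(self, name: str) -> None:
        if name not in self.engines:
            raise KeyError(f"unknown engine: {name}")
        self.active = name

    def transcribe(self, chunk: bytes) -> str:
        if not self.active:
            raise RuntimeError("no engine registered")
        return self.engines[self.active](chunk)


# Stand-in engines (assumption): real ones would wrap the Parakeet and Whisper models.
def fake_parakeet(chunk: bytes) -> str:
    return f"[parakeet] {len(chunk)} bytes"


def fake_whisper(chunk: bytes) -> str:
    return f"[whisper] {len(chunk)} bytes"


switcher = EngineSwitcher()
switcher.register("parakeet", fake_parakeet)
switcher.register("whisper", fake_whisper)
```

Calling `switcher.switch("whisper")` between chunks changes which engine handles the next call to `switcher.transcribe(...)`, without restarting the session. That is the behavior "switch on the fly" suggests, though the real server may implement it differently.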

Demo

Start the server, open the launcher, and speak into your mic. Watch text appear in real time. Toggle between engines to compare speed and accuracy.

[Screenshots: Start server, Live transcript]

Run it

docker compose up -d
python client/launcher.py
# Opens at http://localhost:9848
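If you script these steps, it can help to wait until the port is accepting connections before opening the page. The helper below is a minimal sketch using only the Python standard library. `wait_for_port` is a hypothetical name, not part of the project, and it assumes that the launcher at `localhost:9848` accepts plain TCP connections once it is ready.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not up yet; retry shortly
    return False
```

For example, `wait_for_port("localhost", 9848)` returns `True` once the port is reachable, or `False` if nothing is listening after 30 seconds. A successful connection only shows that the port is open, not that the models have finished loading.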