r/LocalLLaMA (top, day) · Jun 7, 2026, 11:02 AM · /u/Apart_Boat9666

Dockerized Nemotron 3.5 ASR: Better Multilingual Support & Streaming (4.5x CPU Speed)

Original: Dockerized Nemotron 3.5 ASR — Switched from Parakeet, better multilingual support + streaming (4.5x realtime speed on cpu)

A developer released a Dockerized Nemotron 3.5 ASR pipeline that supports 40+ languages and transcribes on CPU at 4.5x real-time speed.

A developer on Reddit shared a Dockerized implementation of Nemotron 3.5 ASR after migrating from Parakeet. The system supports over 40 languages and has a native streaming architecture that avoids buffering the full file. Using the onnxruntime-genai backend, it reaches 4.5x real-time speed on CPU (audio is transcribed 4.5 times faster than it plays back); CUDA support is planned but untested.
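
The summary does not include code, so the following is only a minimal sketch of the two ideas it mentions: reading audio in small chunks rather than buffering the whole file, and what a "4.5x real-time" figure means arithmetically. It uses only Python's standard library. The names `iter_pcm_chunks`, `stream_transcribe`, `speed_factor`, `make_silence_wav`, and the `recognize_chunk` callback are hypothetical and are not taken from the project, Nemotron, or onnxruntime-genai.

```python
import io
import wave
from typing import Callable, Iterator


def iter_pcm_chunks(wav_file, chunk_ms: int = 500) -> Iterator[bytes]:
    """Yield successive chunks of raw PCM frames without loading the whole file."""
    with wave.open(wav_file, "rb") as wav:
        # Number of audio frames that cover chunk_ms milliseconds.
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:
                break
            yield frames


def stream_transcribe(
    wav_file,
    recognize_chunk: Callable[[bytes], str],  # hypothetical recognizer callback
    chunk_ms: int = 500,
) -> Iterator[str]:
    """Feed each chunk to the recognizer and yield partial text as it arrives."""
    for chunk in iter_pcm_chunks(wav_file, chunk_ms):
        text = recognize_chunk(chunk)
        if text:
            yield text


def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Speed relative to real time: 4.5 means 45 s of audio is processed in 10 s."""
    return audio_seconds / processing_seconds


def make_silence_wav(seconds: int, rate: int = 16000) -> io.BytesIO:
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    buf.seek(0)
    return buf
```

In a real pipeline, `recognize_chunk` would wrap whatever streaming call the ASR backend exposes; here it is left as a plain callback so the chunking logic can be tried without a model.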

On Reddit's LocalLLaMA forum, a developer shared their hands-on experience migrating a speech recognition (ASR) pipeline from Parakeet to NVIDIA Nemotron 3.5 ASR, and released the Dockerized project as open source.

Summaries are AI-generated; the original article is authoritative.