Dockerized Nemotron 3.5 ASR: Better Multilingual Support & Streaming (4.5x CPU Speed)
r/LocalLLaMA top day·7 days ago·New Tool
A developer on Reddit shared a Dockerized implementation of Nemotron 3.5 ASR, migrating from Parakeet. The system supports over 40 languages and features a native streaming architecture that avoids full-file buffering. Using the onnxruntime-genai backend, it achieves 4.5x real-time speed on CPU, with CUDA support planned but untested.