Offline CPU Voice Loop for Ollama and LM Studio Agents
r/LocalLLaMA top day·3 days ago·New Tool
A r/LocalLLaMA post introduces an offline voice loop for talking to local models through Ollama, LM Studio, or vLLM.
The stack uses Silero VAD, Parakeet TDT 0.6B v3 STT, and Supertonic TTS 3, all running on CPU so GPU memory stays available for the LLM.
The author reports measured CPU-only benchmarks, agent integrations, cross-platform installers, and an MIT-licensed GitHub release.