Google DeepMind has released Gemini 3.5 Live Translate, bringing near-real-time, natural-sounding voice translation to three major Google platforms. The feature is integrated into Google AI Studio for developers, Google Translate for general users, and Google Meet for remote collaboration. The emphasis on naturalness, not just speed, marks a meaningful step forward for AI-powered multilingual communication.
Mistral AI introduced Voxtral, a family of speech understanding models with 24B- and 3B-parameter variants released under the Apache 2.0 license. The models support long-context transcription, audio Q&A, summarization, multilingual detection, and function calling from voice. Mistral says Voxtral is competitive across transcription and audio understanding benchmarks, with API access starting at $0.001 per minute and local downloads available on Hugging Face.
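For a rough sense of what that entry price means in practice, the arithmetic can be sketched as below. This is an illustrative estimate only: it assumes the quoted $0.001 per minute applies as a flat rate billed in proportion to audio length, which the announcement does not confirm, and the function name `estimate_transcription_cost` is hypothetical rather than part of any Mistral SDK.

```python
from decimal import Decimal

# Assumption: the "starting at" rate from the announcement, applied as a flat rate.
RATE_PER_MINUTE_USD = Decimal("0.001")


def estimate_transcription_cost(
    audio_seconds: int,
    rate_per_minute: Decimal = RATE_PER_MINUTE_USD,
) -> Decimal:
    """Estimate cost in USD, assuming billing proportional to audio length.

    Hypothetical helper for illustration; real billing may round per minute
    or vary by model and tier.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = Decimal(audio_seconds) / Decimal(60)
    return minutes * rate_per_minute


if __name__ == "__main__":
    ten_hours = 10 * 60 * 60  # seconds of audio
    print(f"Estimated cost for 10 hours: ${estimate_transcription_cost(ten_hours)}")
```

Under these assumptions, ten hours of audio comes to about $0.60. Actual charges may differ depending on the model chosen, the pricing tier, and how usage is rounded.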
Mistral AI introduced Voxtral TTS, its first text-to-speech model, targeting natural multilingual voice generation across nine languages. The 4B-parameter model supports voice adaptation from short reference clips, emotional expressiveness, dialect handling, and low-latency streaming. It is available through the API, Mistral Studio, and Le Chat, with open weights on Hugging Face under the non-commercial CC BY-NC 4.0 license.