Google DeepMind has released Gemini 3.5 Live Translate, bringing near-real-time, naturally flowing voice translation to three major Google platforms. The feature is integrated into Google AI Studio for developers, Google Translate for general users, and Google Meet for remote collaboration. The emphasis on naturalness, not just speed, marks a meaningful step forward for AI-powered multilingual communication.
Cohere has announced "Cohere Transcribe," a new open-source speech recognition model that the company describes as state-of-the-art. Designed for accurate and efficient speech-to-text, it marks Cohere's expansion into open-source audio AI. The model aims to challenge established speech recognition models such as OpenAI's Whisper, with a particular focus on stronger multilingual performance.
Cohere has partnered with RWS, a global leader in translation and localization services, to deliver high-performance AI language intelligence for enterprises. The collaboration integrates Cohere's multilingual models (such as Command R) into RWS's platforms to provide culturally accurate translations. The partnership focuses on secure, enterprise-grade deployment and advanced multilingual Retrieval-Augmented Generation (RAG).
Harvey and ElevenLabs announced a partnership to bring ElevenLabs Text to Speech and Speech to Text into Harvey’s legal AI platform. The first phase will let Harvey deliver spoken answers in almost any language or dialect. Planned later additions include multilingual voice translation, a voice mode, spoken trial simulations, tone customization, and related voice features.
ElevenLabs introduced Scribe v2 Realtime, a low-latency speech-to-text model built for live transcription, voice agents, meeting assistants, and real-time captions. The company says the model supports 90 languages and transcribes with under 150 ms of latency in several major ones. Key features include automatic language detection, voice activity detection (VAD), manual commit, text conditioning, support for multiple audio formats, API access, ElevenLabs Agents integration, and enterprise compliance options.
ElevenLabs introduced Dubbing v2, a new AI dubbing model that preserves the original speaker's tone, pacing, delivery, and emotional intent. It supports more than 90 languages and uses sync-aware translation to make dubbed speech feel more natural. The product is available in ElevenCreative and ElevenProductions, with API access coming soon.