Latest in AI

DiffusionGemma: 4x faster text generation ★ 74
Google DeepMind Blog · 4 days ago · Release
Google’s DiffusionGemma is an experimental open model, released under Apache 2.0, that uses text diffusion instead of standard autoregressive decoding. The 26B MoE model activates 3.8B parameters during inference and is designed for low-latency local workflows. Google claims up to 4x faster generation on dedicated GPUs, but notes that output quality is below that of standard Gemma 4 and that production-quality use cases should still prefer Gemma 4.
NVIDIA Accelerates Google DeepMind’s DiffusionGemma for Local AI
NVIDIA Blog · 4 days ago · Release
Google DeepMind released DiffusionGemma, an experimental open model built for fast text generation. NVIDIA says it optimized the model for GeForce RTX GPUs, RTX PRO platforms, and DGX Spark systems. Instead of generating text one word at a time, DiffusionGemma produces multiple words in parallel to reduce latency for single-user workloads.
Voxtral TTS: Open-Weights, Low-Latency Text-to-Speech from Mistral AI ★ 78
Mistral AI News · 6 days ago · Release
Mistral AI introduced Voxtral TTS, its first text-to-speech model, focused on realistic multilingual voice generation. The 4B-parameter model supports nine languages, quick voice adaptation from short references, and low-latency streaming for voice agents. Mistral says human evaluations show stronger naturalness than ElevenLabs Flash v2.5, with API access, Studio testing, Le Chat access, and open weights on Hugging Face.
Google DeepMind Launches Gemini 3.1 Flash Live: Making Voice AI More Natural and More Reliable ★ 85
Google DeepMind Blog · 80 days ago · Release
Google DeepMind has officially unveiled its latest voice model, "Gemini 3.1 Flash Live." This model is positioned to deliver lower-latency, higher-precision…
Google Launches Gemini 3.1 Flash-Lite: Built for Intelligent Applications at Scale, the Fastest and Most Cost-Efficient Gemini 3 Model to Date ★ 78
Google DeepMind Blog · 103 days ago · Release
Google DeepMind has officially released Gemini 3.1 Flash-Lite, the fastest and most cost-efficient version within its latest generation Gemini 3 series of…