Latest in AI

Showing: gemma, Developers


DiffusionGemma: Google Launches High-Speed Open-Weight Gemma Diffusion Model ★ 76
Simon Willison's Weblog · 3 days ago · Release
Simon Willison highlights Google’s new DiffusionGemma, an Apache 2 licensed open-weight Gemma model. He connects it to last year’s brief Gemini Diffusion preview, which he measured at 857 tokens per second. NVIDIA is currently hosting the model for free on its NIM cloud API, where Willison generated 2,409 tokens in 4.4 seconds, implying at least 500 tokens per second.
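The throughput figure in the summary above can be checked with a small calculation. This is a minimal sketch: the helper name `tokens_per_second` is made up for illustration, and the only inputs are the two numbers quoted in the summary. Because the 4.4 seconds is wall-clock time that presumably includes network and queueing overhead, the result is best read as a lower bound on raw generation speed.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return average throughput for a single timed generation."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Figures quoted in the summary: 2,409 tokens in 4.4 seconds.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/s")
```

The result is above the 500 tokens per second floor mentioned in the summary and below the 857 tokens per second reported for the earlier Gemini Diffusion preview.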
DiffusionGemma: 4x faster text generation ★ 74
Google DeepMind Blog · 4 days ago · Release
Google’s DiffusionGemma is an Apache 2.0 experimental open model using text diffusion instead of standard autoregressive decoding. The 26B MoE model activates 3.8B parameters during inference and is designed for low-latency local workflows. Google claims up to 4x faster generation on dedicated GPUs, while noting that output quality is below standard Gemma 4 and production-quality use cases should still prefer Gemma 4.
DiffusionGemma: The Developer Guide — Google Developers Blog
r/LocalLLaMA top day · 4 days ago · Tutorial
Google has released a comprehensive developer guide for DiffusionGemma, a text-generation model that uses masked diffusion rather than autoregressive next-token prediction. Unlike standard Gemma models, DiffusionGemma iteratively denoises a fully masked sequence to produce output, enabling a fundamentally different generation paradigm. The guide targets developers looking to integrate or experiment with diffusion-based LLMs using Google's tooling.
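The idea of starting from a fully masked sequence and filling it in over several passes can be illustrated with a toy loop. This is only a sketch of iterative unmasking in general, not DiffusionGemma's actual algorithm or API: `toy_denoise`, `MASK`, and `tokens_per_step` are hypothetical names, the `target` list stands in for what a real model would predict, and the confidence scores are random rather than model-derived.

```python
import random

MASK = "<mask>"


def toy_denoise(target, tokens_per_step=2, seed=0):
    """Reveal a few masked positions per step until nothing is masked.

    `target` plays the role of the model's predictions; a real denoiser
    would score every masked position with a neural network instead.
    """
    rng = random.Random(seed)
    seq = [MASK] * len(target)  # start from a fully masked sequence
    steps = 0
    while MASK in seq:
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        # Hypothetical confidence per masked position (random in this toy).
        confidence = {i: rng.random() for i in masked}
        # Commit the most confident positions in this step, in parallel.
        chosen = sorted(masked, key=lambda i: confidence[i], reverse=True)
        for i in chosen[:tokens_per_step]:
            seq[i] = target[i]
        steps += 1
    return seq, steps


seq, steps = toy_denoise("text diffusion fills many positions".split())
print(steps, seq)
```

Each pass commits several positions at once, which is the contrast with token-by-token autoregressive decoding: the number of passes depends on how many positions are committed per step rather than on the sequence length alone.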
DiffusionGemma: 4x Faster Text Generation ★ 76
Hacker News (AI keywords) · 4 days ago · Release
Google released DiffusionGemma, a 26B MoE experimental open model using text diffusion instead of token-by-token autoregressive decoding. It can generate blocks of text in parallel, reaching up to 4x faster output on dedicated GPUs. The model targets local, speed-sensitive workflows, but Google says its output quality is below standard Gemma 4 and recommends Gemma 4 for quality-critical production use.
Watch agents fight: a live challenge to speed up Gemma 4 E4B inference on a single A10G
r/LocalLLaMA top day · 5 days ago · Benchmark
A public HuggingFace Spaces dashboard hosts a live competition where AI agents race to optimize Gemma 4 E4B inference throughput on a single NVIDIA A10G GPU. The challenge gamifies ML inference engineering, letting anyone watch agents explore quantization and scheduling strategies in real time. Optimization recipes surfaced by the competition offer practical value for developers targeting single-GPU self-hosted Gemma 4 deployments.
Anyone seen benchmarks comparing Gemma 4 4-bit QAT vs. 8-bit standard quants?
r/LocalLLaMA top day · 5 days ago · Benchmark
An r/LocalLLaMA user is looking for benchmarks comparing Gemma 4 4-bit QAT (quantization-aware training) models, via Unsloth, against standard 8-bit non-QAT quantized models. They understand QAT is expected to preserve much of the BF16 baseline accuracy, but want hard numbers against traditional 8-bit PTQ (post-training quantization). The post notes scattered feedback but no clear head-to-head evaluation yet.
llama.cpp PR adds MTP support for Gemma-4 E2B and E4B assistants
r/LocalLLaMA top day · 5 days ago · Release
The Reddit post links to ggml-org/llama.cpp Pull Request #24282, which adds MTP support for Gemma-4 E2B and E4B assistants. The submitter frames it as useful for tiny Gemma models on phones, low-end machines, Raspberry Pi, or similarly constrained devices. The post does not include benchmarks, merge status, or setup instructions, so it should be treated as a development signal rather than a finished release.
Thoughts on Gemma4 12B vs 26A4B: Which Is Better?
r/LocalLLaMA top day · 6 days ago · Opinion
The post asks the LocalLLaMA community to compare Gemma4 12B and 26A4B, explicitly excluding the 31B model from discussion. The user is mainly interested in creative tasks, writing, and chatting, with coding treated as optional rather than central. No benchmarks or examples are provided, so the post is best read as a model-selection question about subjective quality and practical use.
Gemma 4 and the Keys to Open-Source Model Success: Why Benchmark Scores Are No Longer the Only Metric ★ 75
Interconnects (Nathan L.) · 72 days ago · Commentary
This article takes a deep dive into the release of Google's latest open-source model Gemma 4, using it as an opportunity to re-examine the core factors that…
Google DeepMind Launches T5Gemma: A New Encoder-Decoder Gemma Model Family ★ 75
Google DeepMind Blog · 232 days ago · Release
Google DeepMind has announced a new open-source large language model series called "T5Gemma." This release represents another significant milestone for Google…
Google Launches Gemma 3n: A New Guide Built for the Developer Community ★ 70
Google DeepMind Blog · 232 days ago · Release
Google DeepMind officially launched Gemma 3n along with its developer guide. The Gemma series, as Google's open-weights model family, has long been a favorite…
Google DeepMind Launches VaultGemma: The World's Most Capable Differentially Private Open-Source Large Language Model ★ 80
Google DeepMind Blog · 234 days ago · Release
Google DeepMind has officially launched VaultGemma, currently the world's most capable large language model (LLM) trained entirely from scratch using…
Welcome EmbeddingGemma: Google's New Efficient Embedding Model Arrives on Hugging Face ★ 75
Hugging Face Blog · 283 days ago · Release
Google has recently launched a new open-source text embedding model called "EmbeddingGemma" on the Hugging Face platform. This model is built on the…
Google Releases Gemma 2 2B, the ShieldGemma Safety Classifier, and the Gemma Scope Interpretability Tool ★ 85
Hugging Face Blog · 683 days ago · Release
Google released a major update to the Gemma 2 family in late July 2024, comprising three core components: 1. **Gemma 2 2B**: A lightweight model with just 2.6B…
Fine-Tuning Gemma Models in Hugging Face ★ 80
Hugging Face Blog · 842 days ago · Tutorial
After Google released the Gemma family of open-source models (including 2B and 7B parameter versions), Hugging Face promptly published this practical…
Welcome Gemma: Google's New Open-Source Large Language Model Lands on Hugging Face ★ 85
Hugging Face Blog · 844 days ago · Release
Google has officially released a new family of open-source large language models called "Gemma" — a series of lightweight, state-of-the-art open-source models…
