Latest in AI

olmo-eval: An Evaluation Workbench for the Model Development Loop
Hugging Face Blog · 2 days ago · New Tool
The Hugging Face Blog post announces olmo-eval, described as an evaluation workbench for the model development loop. Based on the title alone, the project appears focused on helping teams evaluate models during iterative development rather than only after release. No article body was provided, so specific features, supported benchmarks, integrations, metrics, or usage details cannot be confirmed.
[AINews] Open Models, Model Labs vs Agent Labs, and the Untrainable ★ 72
Latent Space · 3 days ago · Commentary
This AINews issue uses Sarah Guo’s essay as a lens for current AI industry debates: where open models matter, how agent labs differ from model labs, and what cannot be trained away. It also recaps discourse around Anthropic Fable/Mythos, Fable 5’s capabilities, Google’s DiffusionGemma, and maturing agent infrastructure. The central takeaway is that durable value may lie in integration, customer translation, maintenance, and intent rather than model scores alone.
Upgrading agentic coding capabilities with the new Devstral models ★ 72
Mistral AI News · 6 days ago · Release
Mistral AI announced two Devstral updates focused on agentic coding workflows: Devstral Small 1.1 and Devstral Medium. Devstral Small 1.1 remains a 24B Apache 2.0 open model and reaches 53.6% on SWE-Bench Verified. Devstral Medium reaches 61.6%, is available through Mistral’s API, and supports private deployment and custom finetuning for enterprises.
Mistral AI partners with NVIDIA to accelerate open frontier models ★ 74
Mistral AI News · 6 days ago · Business
Mistral AI announced it is a founding member of the NVIDIA Nemotron Coalition, a global initiative for open frontier foundation models. The partnership combines Mistral AI’s model architecture, training techniques, multimodal capabilities, and enterprise fine-tuning tools with NVIDIA compute, development tools, and synthetic data pipelines. The coalition’s first initiative is a DGX Cloud-trained base model that will support the upcoming NVIDIA Nemotron 4 family and be open-sourced for specialization.
Reve 2 and Ideogram 4: Layouts in Imagegen
Latent Space · 10 days ago · Release
Latent Space’s roundup frames image composition as a major barrier now being tackled by layout-aware image models. Reve 2.0 emphasizes precise generation and editing with layouts, while Ideogram 4.0 uses bounding boxes tied to region descriptions. The issue also covers MAI-Thinking-1, Gemma 4 12B, open audio models, agent execution layers, and model-routing cost debates.
Google's Gemma 4 12B is designed to run on 16GB RAM laptops
Ars Technica AI · 11 days ago · Release
Google introduced Gemma 4 12B, an open model aimed at running locally on laptops with 16GB of RAM. The model uses a new encoding scheme and token prediction to improve efficiency relative to its size. Its practical importance depends on real-world benchmarks, but it could lower the barrier for private, offline, and local multimodal AI workflows.
Open and closed models are on different exponentials
Interconnects (Nathan L.) · 13 days ago · Commentary
Nathan Lambert argues that open and closed models are developing along different exponential curves. The key question is whether marginal gains in model intelligence translate into practical value. Some use cases may reward small capability improvements, while others may not benefit proportionally from additional intelligence.
Some ideas for what comes next, May 2026
Interconnects (Nathan L.) · 19 days ago · Commentary
Nathan Lambert argues that 2026 AI progress is becoming higher-stakes, with model capabilities, work patterns, economics, and real-world risks all escalating. He says open models still lack a true Claude Code and Opus 4.5-style agent moment, and Gemini has no clear competitor to Claude Code or Codex yet. The essay also tracks Mythos, American open-model momentum, frontier-lab competition, and mounting intervention from governments and other power structures.