Google DeepMind announced today (February 18, 2026) that its popular AI assistant application Gemini has officially integrated its most advanced music…
Vercel officially released the latest major version of its popular open-source library — AI SDK 6. As AI applications evolve from simple "chat boxes" into…
Google DeepMind has announced a major upgrade to its Gemini audio models, aimed at delivering a more natural, fluid, and low-latency voice interaction…
Google DeepMind has unveiled a new model called "Nano Banana Pro," which is also the Pro-tier image model of the Gemini 3 generation (Gemini 3 Pro Image…
Vercel has announced in its product changelog that it has officially added support for "Nano Banana Pro" — Google's Gemini 3 Pro Image model — to Vercel AI…
Google DeepMind today officially unveiled its latest-generation AI model family, Gemini 3, and extended an invitation to developers worldwide, formally…
Google DeepMind officially unveiled its latest flagship AI model — Gemini 3 — in November 2025. This marks a new milestone for Google in the field of…
Google DeepMind has officially introduced SIMA 2 (Scalable Instructable Multiworld Agent 2). Compared to its predecessor, the most significant transformation…
Google DeepMind officially announced the launch of a new "MedGemma" multimodal model within its open-source medical model series. This model represents the…
Google DeepMind today announced that Gemini 2.5 Flash-Lite — its lightweight AI model that had previously been in preview — has officially transitioned to a…
Google DeepMind recently unveiled a new experimental AI tool called "Backstory," designed to help internet users deeply explore and understand the background…
Google DeepMind has officially announced the launch of Gemini Robotics 1.5, marking the formal entry of AI agent technology into the physical world and…
Google DeepMind has officially launched the new dedicated "Gemini 2.5 Computer Use" model, which is now available in preview via API. This model is built on…
Hugging Face has recently released a major update for its innovative spreadsheet AI tool "AI Sheets," officially unlocking powerful image processing…
Hugging Face's TRL (Transformer Reinforcement Learning) is a popular open-source library designed for aligning large language models (LLMs). In its…
Vercel has officially launched AI SDK 5, a major milestone for the open-source AI development toolkit built for web developers. As AI applications evolve from…
As large multimodal models (LMMs) have achieved breakthroughs in image and short-video understanding, the industry has gradually shifted its attention to the…
With the rapid development of vision-language models (VLMs) and multimodal AI, the amount of data required to train these models has grown explosively…
NVIDIA has partnered with Hugging Face to officially bring its latest lightweight vision-language model (VLM) — the **NVIDIA Llama Nemotron Nano VLM** — to the…
Google DeepMind has announced that its latest-generation model, Gemini 2.5, has achieved new breakthroughs in AI-driven audio dialog and audio generation. This…
Hugging Face recently launched an open-source project called nanoVLM, positioned as "the simplest repository for training Vision Language Models (VLMs) in pure…
Google DeepMind has officially released a preview of its new open model "Gemma 3n." This is a cutting-edge open model purpose-built for mobile devices and…
With the explosion of multimodal technology, Vision Language Models (VLMs) have evolved from laboratory research prototypes into core tools for enterprises and…
The Language Technologies department (BSC-LT) of the Barcelona Supercomputing Center (BSC) recently released a new open-source multimodal model on Hugging Face…
Google has officially launched Gemma 3, the next generation of its open-source large language model series — a major technical leap forward from Gemma 2. Gemma…
Cohere For AI (C4AI) has officially launched "Aya Vision," a series of open-source multimodal models (available in 8B and 32B parameter versions) designed…
Google has officially launched the PaliGemma 2 Mix model series — a new family of open-source instruction-tuned vision-language models (VLMs) now available on…
On January 24, 2025, Hugging Face announced that smolagents — its open-source library designed for building lightweight, high-performance AI agents — now…
Hugging Face has officially introduced the newest members of the SmolVLM family, pushing vision-language model (VLM) sizes even further down to 256M (256…
As multimodal large language models (such as GPT-4o, Gemini, and various open-source audio models) continue to proliferate, AI's ability to process audio has…