Lemonade v10.7 marks a project-level shift toward working-group-driven development, with 19 contributors involved in the release. The update improves LMX-Omni virtual models for Open WebUI and OpenAI-compatible multimedia clients, introduces the `lemonade bench` CLI, and expands backend support. Work spanning CUDA, Vulkan, llama.cpp, stable-diffusion.cpp, FastFlowLM, and vLLM is part of the broader push toward cross-vendor local AI performance.
Google has announced Gemini 3.5 Live Translate, a real-time voice-to-voice translation system that preserves the original speaker's tone, pacing, and pitch rather than producing flat synthetic output. The system embeds Google's SynthID watermarks into translated audio, so the output can be identified as AI-generated without affecting audio quality. This extends Google's Gemini Live multimodal API capabilities into cross-language communication scenarios such as meetings, live streams, and customer service.
Google introduced Gemma 4 12B, an open model aimed at running locally on laptops with 16 GB of RAM. The model uses a new encoding scheme and token prediction to improve efficiency relative to its size. How much this matters in practice will depend on real-world benchmarks, but it could lower the barrier for private, offline, and local multimodal AI workflows.
Google recently unveiled a brand-new "anything-to-anything" multimodal AI model — Gemini Omni — whose powerful cross-modal generation and transformation…
The mysterious AI startup Hark has announced the successful completion of a Series A funding round totaling $700 million (approximately NT$22 billion), capital…
In the latest issue of Latent Space AINews, the major announcements from Google I/O 2026 were covered in depth. Google demonstrated its formidable R&D and…
Google DeepMind has officially unveiled its latest flagship AI model, "Gemini Omni." This model represents a major breakthrough by Google in the field of…
Google DeepMind has unveiled a new initiative called "Gemini for Science" — a collection of AI tools and experiments designed to expand the scale and precision…
NVIDIA has officially launched a new lightweight multimodal model, "Nemotron 3 Nano Omni." This model is designed to deliver powerful multimodal intelligence…
The Technology Innovation Institute (TII) of the UAE has officially announced the launch of its new "Falcon Perception" model on the Hugging Face blog. As an…
IBM has officially launched its new lightweight multimodal model on Hugging Face — the Granite 4.0 3B Vision. With 3 billion (3B) parameters, this model is…
Hugging Face has published its Spring 2026 "State of Open Source AI" report, offering a comprehensive review of the explosive growth and paradigm shifts that…
Google DeepMind announced today (February 18, 2026) that its popular AI assistant application Gemini has officially integrated its most advanced music…
Google DeepMind has announced a major upgrade to its Gemini audio models, aimed at delivering a more natural, fluid, and low-latency voice interaction…
Google DeepMind has unveiled a new model called "Nano Banana Pro," which is also the Pro-tier image model of the Gemini 3 generation (Gemini 3 Pro Image…
Google DeepMind today officially unveiled its latest generation AI model family — Gemini 3 — and extended an invitation to developers worldwide, formally…
Google DeepMind officially unveiled its latest flagship AI model — Gemini 3 — in November 2025. This marks a new milestone for Google in the field of…
Google DeepMind has officially introduced SIMA 2 (Scalable Instructable Multiworld Agent 2). Compared to its predecessor, the most significant transformation…
Google DeepMind today announced that Gemini 2.5 Flash-Lite — its lightweight AI model that had previously been in preview — has officially transitioned to a…
Google DeepMind recently unveiled a new experimental AI tool called "Backstory," designed to help internet users deeply explore and understand the background…
Google DeepMind has officially announced the launch of Gemini Robotics 1.5, marking the formal entry of AI Agent technology into the physical world and…
Google DeepMind has officially launched the new dedicated "Gemini 2.5 Computer Use" model, which is now available in preview via API. This model is built on…
Hugging Face has recently released a major update for its innovative spreadsheet AI tool "AI Sheets," officially unlocking powerful image processing…
Google DeepMind has announced that its latest-generation model, Gemini 2.5, has achieved new breakthroughs in AI-driven audio dialog and audio generation. This…
Google DeepMind has officially released a preview of its new open model "Gemma 3n." This is a cutting-edge open model purpose-built for mobile devices and…
With the explosion of multimodal technology, Vision Language Models (VLMs) have evolved from laboratory research prototypes into core tools for enterprises and…
Google has officially launched Gemma 3, the next generation of its open-source large language model series — a major technical leap forward from Gemma 2. Gemma…
Cohere For AI (C4AI) has officially launched "Aya Vision," a series of open-source multimodal models (available in 8B and 32B parameter versions) designed…
In this case study, Prezi — the well-known company behind the non-linear presentation software of the same name — shares how it is embracing the "multimodal…
The Technology Innovation Institute (TII) of Abu Dhabi has officially released a new open-source model family on Hugging Face — Falcon 2 11B. This model, with…