A LocalLLaMA user tried to benchmark Google’s new fully local dictation app, Eloquent, against open ASR models such as Qwen3-ASR and NVIDIA Parakeet V3. The tester reported that roughly half of dictations returned only fragments, even during manual use. When Eloquent produced complete transcripts, its word error rate was competitive, but the missing-output behavior made the app unreliable for evaluation and practical use.
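For readers unfamiliar with the metric, here is a minimal sketch of how word error rate is commonly computed: word-level edit distance divided by the number of reference words. It is not the tester's actual evaluation script; the function name `word_error_rate` and the lowercase-and-split normalization are assumptions made for illustration. It also shows why fragmentary output is so damaging: every missing word counts as a deletion.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; real benchmarks usually apply
    heavier text normalization (punctuation, numerals) before scoring.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# A transcript that stops after three of nine words scores 6 deletions.
full = "the quick brown fox jumps over the lazy dog"
fragment = "the quick brown"
print(round(word_error_rate(full, fragment), 3))
```

Scoring only the complete transcripts, as the tester's "competitive" figure implies, hides this failure mode, so a benchmark of this kind would probably want to report the share of fragmentary outputs alongside WER.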
Apple announced Core AI at WWDC, which the post frames as a possible future replacement for Core ML and an alternative to MLX, llama.cpp, and PyTorch for optimized on-device inference. Models still need to be converted with Python scripts, and the currently supported models appear to date mostly from mid-2025. No performance data is available yet; the author expects it may trail MLX on GPU, but Apple's claim of a 20B on-device foundation model suggests that larger app-bundled models could become possible.
Apple’s Core AI framework is positioned as a developer stack for deploying AI models directly inside apps on Apple silicon. The documentation describes Swift APIs, `.aimodel` assets, model specialization, caching, Xcode profiling, and debugging tools. It appears aimed at developers building low-latency, privacy-conscious on-device inference workflows, though the documentation is marked as preliminary beta information.
Google released new Gemma 4 checkpoints optimized with Quantization-Aware Training to preserve quality after compression. The release includes Q4_0 checkpoints and a mobile-focused quantization format that can reduce Gemma 4 E2B memory use to about 1GB, or below 1GB for a text-only configuration. The models are available through Hugging Face and supported across llama.cpp, Ollama, LM Studio, LiteRT-LM, Transformers.js, SGLang, vLLM, MLX, and Unsloth.
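As a rough guide to the arithmetic behind such memory figures, the sketch below estimates weight storage for the Q4_0 layout used by llama.cpp, where each block of 32 weights is stored as 16 bytes of packed 4-bit values plus one 2-byte fp16 scale (4.5 bits per weight). The parameter count in the example is a hypothetical round number, not Gemma 4 E2B's actual size, and the mobile-focused format mentioned above is a different scheme that this sketch does not model. Real checkpoints also mix tensor types and need additional memory for the KV cache and activations.

```python
Q4_0_BLOCK_WEIGHTS = 32  # weights per Q4_0 block
Q4_0_BLOCK_BYTES = 18    # 16 bytes of 4-bit values + 2-byte fp16 scale


def q4_0_weight_bytes(n_params: int) -> int:
    """Bytes needed to store n_params weights if every tensor used Q4_0.

    Hypothetical helper: assumes a uniform Q4_0 layout and ignores
    metadata, unquantized tensors, and runtime buffers.
    """
    blocks = -(-n_params // Q4_0_BLOCK_WEIGHTS)  # ceiling division
    return blocks * Q4_0_BLOCK_BYTES


# Illustrative only: 2 billion parameters is an assumed round number.
example_params = 2_000_000_000
size_bytes = q4_0_weight_bytes(example_params)
print(f"{size_bytes / 2**30:.2f} GiB at "
      f"{Q4_0_BLOCK_BYTES * 8 / Q4_0_BLOCK_WEIGHTS} bits per weight")
```

The same weights stored in 16-bit floats would take 2 bytes each, so a uniform Q4_0 layout is roughly 3.6 times smaller before any format-specific savings.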
Google DeepMind published a blog post on November 20, 2025 titled "Introducing Nano Banana Pro." As the full content of the original article is not publicly…
This article, jointly published by IBM and Hugging Face, delves into the technical details and application scenarios of the brand-new ultra-lightweight model…
As generative AI advances rapidly, deploying massive models to resource-constrained edge devices — such as smartphones, smart hardware, and AI PCs — has become…
Hugging Face has announced the release of a brand-new generation of lightweight open-source models — SmolLM3. As the latest member of the SmolLM family…
The Technology Innovation Institute (TII) of the United Arab Emirates has officially released the "Falcon-Edge" model series on Hugging Face. This is a family…
Hugging Face has officially launched a lightweight vision language model (VLM) called **SmolVLM**, designed to bring powerful multimodal understanding…
In late 2022, Apple and Hugging Face jointly announced that Stable Diffusion had officially gained support for Apple Silicon's Core ML framework. This update…