Avataar AI has launched Varya, a video generation model built from Alibaba’s open Wan 2.2 model and distilled for faster, cheaper output. The company says Varya can generate 5-second 720p clips on an NVIDIA H200 in 45 seconds, versus 1,230 seconds for Wan 2.2. Avataar plans to release the model and training data through India’s AI Kosh portal while offering hosted access at about $0.005 per second.
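To put the quoted Varya figures in perspective, the arithmetic can be sketched as below. This is a minimal sketch that uses only the company-reported numbers from the summary. It assumes the hosted price is billed per second of generated video, which the summary does not state explicitly. The constant and function names are illustrative.

```python
# Company-reported figures from the summary above (not independently verified).
CLIP_SECONDS = 5            # length of the generated clip
VARYA_GEN_SECONDS = 45      # reported Varya generation time on an NVIDIA H200
WAN22_GEN_SECONDS = 1230    # reported Wan 2.2 generation time for the same clip
PRICE_PER_SECOND_USD = 0.005  # ASSUMPTION: price applies per second of output video


def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new model is than the baseline."""
    return baseline_seconds / new_seconds


def clip_cost(clip_seconds: float, price_per_second: float) -> float:
    """Return the hosted cost of one clip under the per-output-second assumption."""
    return clip_seconds * price_per_second


speedup_x = speedup(WAN22_GEN_SECONDS, VARYA_GEN_SECONDS)
compute_per_video_second = VARYA_GEN_SECONDS / CLIP_SECONDS
cost_usd = clip_cost(CLIP_SECONDS, PRICE_PER_SECOND_USD)

print(f"Speedup over Wan 2.2: {speedup_x:.1f}x")
print(f"Generation time per second of video: {compute_per_video_second:.0f} s")
print(f"Estimated hosted cost per 5-second clip: ${cost_usd:.3f}")
```

Under these assumptions, the reported speedup is roughly 27x, and a 5-second clip would cost about $0.025 on the hosted service.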
An r/LocalLLaMA post claims Anthropic may be intentionally limiting Fable when users ask it to help build other LLMs. The source is a short Reddit post with a screenshot for context, not a formal benchmark or verified disclosure. Discussion centers on trust in hosted closed models, unclear safety boundaries, and why local or open-weight LLMs may be necessary for serious AI development work.
Omi Health’s founder says he fine-tuned NVIDIA Parakeet TDT 0.6B v2 for clinical speech and released Omi Med STT v1 under CC-BY-4.0. The runtime supports Mac, Windows, and Linux, auto-selecting MLX, NeMo, or GGUF/parakeet.cpp backends. On the author’s own held-out medical benchmark, the model reportedly reaches 2.37% medical-WER and runs at 145× realtime on local A10 compute.
Mistral AI announced Magistral, its first reasoning model family, with Magistral Small as a 24B open-weight Apache 2.0 model and Magistral Medium for enterprise use. The company emphasizes traceable multilingual reasoning, professional-domain use cases, and faster reasoning in Le Chat through Think mode and Flash Answers. Magistral Small is available on Hugging Face, while Magistral Medium is available in Le Chat preview and via La Plateforme API.
Mistral AI introduced Mistral 3, a new open model family under Apache 2.0. It includes Mistral Large 3, a 675B-parameter sparse MoE with 41B active parameters, plus Ministral 3 models at 3B, 8B, and 14B. The release targets frontier open-weight use, multimodal and multilingual workflows, enterprise customization, and efficient local or edge deployments.
Mistral introduced Devstral 2, a 123B coding model, and Devstral Small 2, a 24B variant for lighter deployment. The company reports 72.2% and 68.0% on SWE-bench Verified, respectively, with permissive open-source licensing. It also launched Mistral Vibe CLI, an open-source terminal agent for codebase exploration, multi-file edits, command execution, and IDE integration.
Mistral AI introduced Mistral Small 4 as the next major release in the Mistral Small family. It combines reasoning, multimodal, and agentic coding capabilities into one open model with configurable reasoning effort. The model uses a MoE architecture, supports a 256k context window and text-image inputs, and is available through Mistral API, AI Studio, Hugging Face, NVIDIA NIM, and common inference stacks.
Mistral Medium 3.5 is a 128B dense model in public preview, combining instruction-following, reasoning, and coding with a 256k context window. It becomes the default model for Le Chat and Mistral Vibe. Vibe now supports remote coding agents that run asynchronously in the cloud, while Le Chat adds Work mode for longer multi-step tasks across connected tools.
Mistral AI introduced Mistral 3, a new open model family including Mistral Large 3 and Ministral 3 models at 3B, 8B, and 14B sizes. Large 3 is a 675B-parameter sparse MoE model with 41B active parameters, while Ministral 3 targets local and edge use cases. The models are released under Apache 2.0 and are available through Mistral AI Studio, Hugging Face, Amazon Bedrock, and other platforms.
Mistral Small 4 is the next major release in the Mistral Small family, unifying Magistral-style reasoning, Pixtral-style multimodality, and Devstral-style coding agents. It uses a MoE architecture with 119B total parameters, 6B active parameters per token, a 256k context window, and configurable reasoning effort. The model is available via Mistral API, AI Studio, Hugging Face, open-source serving stacks, and NVIDIA deployment options.
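The sparse MoE figures quoted in this digest can be compared by computing the share of parameters active per token. The sketch below uses only the parameter counts reported above for Mistral Large 3 and Mistral Small 4. The names `MODELS` and `active_fraction` are illustrative, and the ratio is only a rough proxy for per-token compute, since all parameters typically still need to be held in memory.

```python
# (total parameters, active parameters per token), in billions, as reported above.
MODELS = {
    "Mistral Large 3": (675, 41),
    "Mistral Small 4": (119, 6),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the fraction of parameters used per token in a sparse MoE model."""
    return active_b / total_b


fractions = {
    name: active_fraction(total_b, active_b)
    for name, (total_b, active_b) in MODELS.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of parameters active per token")
```

By these reported numbers, both models activate roughly 5 to 6 percent of their weights per token.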