Cohere’s Jay Alammar announced the official release of North Mini Code, following early community feedback on r/LocalLLaMA. Weights are available on Hugging Face, including an FP8 version, and the model can be tried for free through OpenCode. For vLLM deployment, Cohere recommends running the vLLM main branch for now and installing cohere_melody so that responses are parsed accurately. Alammar also acknowledged community requests for quantization and llama.cpp support.
CohereLabs’ North Mini Code 1.0 appears to have moved from early access to final release, with weights available on Hugging Face. The Reddit post describes it as a 30B A3B coding model. Its Artificial Analysis overall score of 28 trails Qwen 3.6 35B’s 43 by 15 points, but its coding index score of 33 is within 2 points of Qwen’s 35 and 11 points above Gemma 4 26B’s 22.
Microsoft announced MAI-Thinking-1, a 35B reasoning model available to select early partners, and MAI-Code-1-Flash, a 5B coding model rolling out to GitHub Copilot individual users in VS Code. Simon Willison highlights their relatively small parameter counts and Microsoft’s claim that MAI-Thinking-1 was preferred to Sonnet 4.6 in internal blind evaluations. He also questions what Microsoft’s claims of clean and appropriately licensed training data mean in practice.