Google DeepMind has officially introduced SIMA 2 (Scalable Instructable Multiworld Agent 2). Compared to its predecessor, the most significant transformation…
Vercel's official Changelog has announced that Kimi K2 Thinking and Kimi K2 Thinking Turbo, the two latest reasoning models from Moonshot AI, are now…
Google DeepMind has officially announced the launch of the "AI for Math Initiative," a major program aimed at deeply integrating artificial intelligence into…
The International Mathematical Olympiad (IMO) has been held annually since 1959 and is the most prestigious and difficult mathematics competition for high…
Google DeepMind has announced that its latest reasoning model, "Gemini 2.5 Deep Think," has achieved gold-medal-level performance at the International…
Google DeepMind has officially announced the rollout of the "Deep Think" feature for Google AI Ultra subscribers within the Gemini application. This new…
Writer, a leading provider of enterprise AI solutions, has officially announced the launch of its new "Palmyra-mini" model series on the Hugging Face platform…
### Background and Core Concepts Traditional large language models (LLMs), when faced with complex mathematics, data analysis, or programming tasks, can…
NVIDIA has officially released a massive "Multi-Lingual Reasoning Dataset" containing 6 million samples on the Hugging Face platform. This significant…
Hugging Face has recently introduced a new benchmark called "TextQuests," designed to evaluate the performance of large language models (LLMs) in text-based…
### What is FutureBench? As large language models (LLMs) and AI agents have rapidly advanced, traditional static benchmarks (such as MMLU and GSM8K) face a…
Hugging Face's AI-MO (AI Math Olympiad) team has officially published Kimina-Prover, a research paper demonstrating how "test-time reinforcement learning…
Hugging Face has announced the release of SmolLM3, a brand-new generation of lightweight open-source models. As the latest member of the SmolLM family…
Since the explosive rise of DeepSeek-R1, GRPO (Group Relative Policy Optimization) has become the most widely discussed reinforcement learning (RL) technique…
Google DeepMind today announced important updates to its flagship model series, Gemini 2.5. The most noteworthy highlight of this update is a brand-new…
With the release of Qwen3, Hugging Face's official blog published an in-depth breakdown of its chat template. Chat templates are the critical bridge…
Grok 3, the flagship AI model from Elon Musk's xAI, has finally opened up API access months after launch, and simultaneously surprised…
Google has officially released its new model Gemini 2.5 Flash, marking Google's comprehensive dominance over the cost-efficiency Pareto frontier on LMArena…
OpenAI recently held a live stream and published a blog post to officially announce the new reasoning model o3 and the lightweight reasoning model o4-mini…
OpenAI has officially released its new flagship model GPT-4.1, positioned as the next-generation "workhorse" designed to give developers and enterprises the…
Although AINews characterized these two days as "a calm day," there was in fact plenty of activity beneath the surface among tech giants and the open-source community. First, on…
After DeepSeek R1 set off a wave of open-source reasoning models, the open-source community saw many projects attempting to replicate its path to success…
Hugging Face's Open R1 project aims to fully open-source and replicate the training pipeline of DeepSeek-R1's reasoning model. In the latest fourth update…
Hugging Face has recently released an updated practical guide for the Open R1 project, walking developers through how to locally deploy and run "OlympicCoder"…
Since its launch, Hugging Face's Open R1 project has been dedicated to replicating the reasoning capabilities of DeepSeek-R1 in a fully open-source manner. In…
Hugging Face has officially published the second technical update (Update #2) for the Open R1 project, which aims to replicate DeepSeek-R1's reasoning model…
As large language model (LLM) technology has evolved, AI has transformed from a simple question-answering assistant into an "AI agent" capable of proactively…
### Background and the Goals of the Open-R1 Project Since the release of DeepSeek-R1, its powerful reasoning capability and remarkably low training cost have…
### Background and the Mystery of the "Aha Moment" Following the release of DeepSeek-R1, a wave of excitement around "Reasoning Models" swept the AI community…
### Project Background: Recreating the Open-Source Miracle of DeepSeek-R1 The emergence of DeepSeek-R1 sent shockwaves through the global AI community…