Latest in AI

Google rolls out Deep Think in the Gemini app and opens the full IMO-level Gemini 2.5 to mathematicians ★ 85
Google DeepMind Blog · 234 days ago · Release
Google DeepMind has officially announced the rollout of the "Deep Think" feature for Google AI Ultra subscribers within the Gemini application. This new…
Introducing the Palmyra-mini series: powerful, lightweight new models with reasoning capabilities ★ 72
Hugging Face Blog · 276 days ago · Release
Writer, a leading provider of enterprise AI solutions, has officially announced the launch of its new "Palmyra-mini" model series on the Hugging Face platform…
Jupyter Agents: Training LLMs to reason and self-correct with notebooks ★ 78
Hugging Face Blog · 277 days ago · Release
Background and Core Concepts: Traditional large language models (LLMs), when faced with complex mathematics, data analysis, or programming tasks, can…
NVIDIA open-sources a 6-million-sample multilingual reasoning dataset on Hugging Face ★ 78
Hugging Face Blog · 297 days ago · Release
NVIDIA has officially released a massive "Multi-Lingual Reasoning Dataset" containing 6 million samples on the Hugging Face platform. This significant…
TextQuests: How well do LLMs really perform in text adventure games? Hugging Face launches a new evaluation benchmark ★ 75
Hugging Face Blog · 306 days ago · Release
Hugging Face has recently introduced a new benchmark called "TextQuests," designed to evaluate the performance of large language models (LLMs) in text-based…
Back to the Future: Hugging Face launches FutureBench to evaluate AI agents' ability to predict future events ★ 75
Hugging Face Blog · 332 days ago · Release
What is FutureBench? As large language models (LLMs) and AI agents have rapidly advanced, traditional static benchmarks (such as MMLU and GSM8K) face a…
Kimina-Prover: Applying test-time reinforcement learning search (Test-time RL Search) to large formal reasoning models ★ 82
Hugging Face Blog · 339 days ago · Release
Hugging Face's AI-MO (AI Math Olympiad) team has officially published Kimina-Prover, a research paper demonstrating how "test-time reinforcement learning…
Hugging Face releases SmolLM3: A lightweight, multilingual, long-context on-device reasoning model ★ 80
Hugging Face Blog · 341 days ago · Release
Hugging Face has announced the release of a brand-new generation of lightweight open-source models — SmolLM3. As the latest member of the SmolLM family…
🐯 Liger GRPO meets TRL: Sharply reducing GPU memory use and speeding up DeepSeek-R1-style reinforcement learning training ★ 82
Hugging Face Blog · 385 days ago · New Tool
Since the explosive rise of DeepSeek-R1, GRPO (Group Relative Policy Optimization) has become the most widely discussed reinforcement learning (RL) technique…
Gemini 2.5 gets a major update: Pro adds an experimental "Deep Think" mode, and Flash gets a further performance boost ★ 85
Google DeepMind Blog · 390 days ago · Release
Google DeepMind today announced important updates to its flagship model series, Gemini 2.5. The most noteworthy highlight of this update is a brand-new…
4 lessons from the Qwen-3 chat template: An in-depth analysis from Hugging Face ★ 75
Hugging Face Blog · 410 days ago · Tutorial
With the release of Qwen-3, Hugging Face's official blog published an in-depth breakdown of its chat template. Chat templates are the critical bridge…
xAI officially opens the Grok 3 and Grok 3-mini APIs, highlighting reasoning and ultra-low pricing ★ 85
TLDR AI (Buttondown) · 421 days ago · Release
Grok 3, the flagship AI model from xAI founded by Elon Musk, has finally officially opened its API access months after launch, and simultaneously surprised…
Google releases Gemini 2.5 Flash: Dominating the price-performance frontier with the first precisely controllable "thinking budget" ★ 85
TLDR AI (Buttondown) · 422 days ago · Release
Google has officially released its new model Gemini 2.5 Flash, marking Google's comprehensive dominance over the cost-efficiency Pareto frontier on LMArena…
OpenAI releases the o3 and o4-mini reasoning models and the open-source terminal tool Codex CLI ★ 90
TLDR AI (Buttondown) · 423 days ago · Release
OpenAI recently held a live stream and published a blog post to officially announce the new reasoning model o3 and the lightweight reasoning model o4-mini…
OpenAI launches its new workhorse model GPT 4.1: A new balance of performance and practicality ★ 85
TLDR AI (Buttondown) · 425 days ago · Release
OpenAI has officially released its new flagship model GPT 4.1, positioned as the next-generation "workhorse" designed to give developers and enterprises the…
[AINews] Undercurrents beneath the calm: OpenAI's memory update, the GPT-4.1 leak, and xAI's official launch of the Grok 3 API ★ 75
TLDR AI (Buttondown) · 429 days ago · Release
Although AINews characterized these two days as "a calm day," in reality, tech giants and the open-source community remained full of undercurrents. First, on…
DeepCoder: Together and Agentica release a fully open-source 14B code reasoning model at o3-mini level ★ 85
TLDR AI (Buttondown) · 431 days ago · Release
After DeepSeek R1 set off a wave of open-source reasoning models, the open-source community saw many projects attempting to replicate its path to success…
Hugging Face publishes the fourth Open R1 update: Latest progress and optimizations in training open-source reasoning models ★ 85
Hugging Face Blog · 445 days ago · Release
Hugging Face's Open R1 project aims to fully open-source and replicate the training pipeline of DeepSeek-R1's reasoning model. In the latest fourth update…
Open R1: How to run OlympicCoder locally with LM Studio for software development ★ 75
Hugging Face Blog · 451 days ago · Tutorial
Hugging Face has recently released an updated practical guide for the Open R1 project, walking developers through how to locally deploy and run "OlympicCoder"…
Open R1 update #3: Hugging Face releases open-source reasoning models and details of GRPO training optimizations ★ 85
Hugging Face Blog · 459 days ago · Release
Since its launch, Hugging Face's Open R1 project has been dedicated to replicating the reasoning capabilities of DeepSeek-R1 in a fully open-source manner. In…
Open R1 update #2: Hugging Face's latest progress in replicating DeepSeek-R1 and its reinforcement learning practice ★ 85
Hugging Face Blog · 489 days ago · Release
Hugging Face has officially published the second technical update (Update #2) for the Open R1 project, which aims to replicate DeepSeek-R1's reasoning model…
Hugging Face launches DABStep: A new benchmark for evaluating the multi-step reasoning of data agents ★ 75
Hugging Face Blog · 495 days ago · Release
As large language model (LLM) technology has evolved, AI has transformed from a simple question-answering assistant into an "AI agent" capable of proactively…
Hugging Face publishes the first Open-R1 update: Progress and challenges in the open-source reproduction of DeepSeek-R1 ★ 85
Hugging Face Blog · 497 days ago · Release
Background and the Goals of the Open-R1 Project: Since the release of DeepSeek-R1, its powerful reasoning capability and remarkably low training cost have…
Mini-R1: An RL tutorial on reproducing DeepSeek-R1's "aha moment" ★ 85
Hugging Face Blog · 499 days ago · Tutorial
Background and the Mystery of the "Aha Moment": Following the release of DeepSeek-R1, a wave of excitement around "Reasoning Models" swept the AI community…
Open-R1: Hugging Face launches a fully open-source project to reproduce DeepSeek-R1 ★ 90
Hugging Face Blog · 502 days ago · Release
Project Background: Recreating the Open-Source Miracle of DeepSeek-R1. The emergence of DeepSeek-R1 sent shockwaves through the global AI community…
Letting large models debate: The first multilingual LLM debate competition ★ 75
Hugging Face Blog · 571 days ago · Release
This article from the Hugging Face blog introduces "The First Multilingual LLM Debate Competition." As large language models (LLMs) have rapidly advanced…
How NuminaMath won the first AIMO Progress Prize (AI Mathematical Olympiad) and announced a full open-source release ★ 80
Hugging Face Blog · 703 days ago · Release
Background and Achievement: The AI Mathematical Olympiad (AIMO) Progress Prize aims to advance AI models capable of solving Olympiad-level mathematical…
Hugging Face launches the Open Chain of Thought (CoT) Leaderboard: Focused on evaluating the reasoning and chain-of-thought abilities of open-source models ★ 75
Hugging Face Blog · 782 days ago · Release
Hugging Face has announced the launch of the new "Open Chain of Thought (CoT) Leaderboard," a public platform specifically designed to evaluate and compare the…
Hugging Face launches the ConTextual leaderboard: Evaluating multimodal models' joint image-text reasoning in text-rich scenes ★ 75
Hugging Face Blog · 831 days ago · Release
Hugging Face has announced the launch of a new multimodal benchmark and leaderboard called "ConTextual," aimed at addressing the shortcomings of existing…
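Hugging Face launches the NPHardEval leaderboard: Revealing the reasoning abilities of large language models through computational complexity and dynamic updates ★ 75
Hugging Face Blog · 863 days ago · Release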
Hugging Face has announced the launch of the new NPHardEval leaderboard, a benchmark specifically designed to evaluate the reasoning capabilities of large…
