Latest in AI

TextQuests: How Well Do LLMs Actually Perform in Text Adventure Games? Hugging Face Launches a New Evaluation Benchmark ★ 75
Hugging Face Blog · 307 days ago · Release
Hugging Face has recently introduced a new benchmark called "TextQuests," designed to evaluate the performance of large language models (LLMs) in text-based…
Replicate Launches a Remote MCP Server: Explore and Run Models Directly in Claude, Cursor, and VS Code ★ 75
Replicate Blog · 309 days ago · New Tool
Replicate has officially launched a remote MCP (Model Context Protocol) server. MCP is an open standard created by Anthropic that enables large language models…
Implementing an MCP Server in Python: Building an AI Virtual Try-On Shopping Assistant with Gradio ★ 75
Hugging Face Blog · 319 days ago · Tutorial
As the Model Context Protocol (MCP) proposed by Anthropic gradually becomes the open standard for connecting large language models (LLMs) with external tools…
Model Context Protocol (MCP) Explained: Frequently Asked Questions (FAQ) ★ 85
Vercel Changelog · 324 days ago · Tutorial
As AI applications become more widespread, how to allow large language models (LLMs) to securely and efficiently access enterprise internal data or external…
Search a Million GitHub Repositories via MCP: Vercel Launches a New AI Protocol Tool ★ 80
Vercel Changelog · 332 days ago · New Tool
Vercel has announced a major update to its AI development tooling, launching a new service based on the Model Context Protocol (MCP) that allows developers to…
Back to the Future: Hugging Face Launches FutureBench to Evaluate AI Agents' Ability to Predict Future Events ★ 75
Hugging Face Blog · 333 days ago · Release
What is FutureBench? As large language models (LLMs) and AI agents have rapidly advanced, traditional static benchmarks (such as MMLU and GSM8K) face a…
Gradio MCP Server Gets Five Major Improvements: Making It Easier for AI Agents to Call Gradio Apps ★ 80
Hugging Face Blog · 333 days ago · Release
The Model Context Protocol (MCP) is an open standard introduced by Anthropic, designed to allow AI assistants (such as Claude) to interact securely and…
ScreenEnv: Deploy Your Full-Stack Desktop AI Agent Environment ★ 82
Hugging Face Blog · 340 days ago · New Tool
With the rise of Anthropic's Claude 3.5 Sonnet "Computer Use" and various GUI-oriented multimodal models, "desktop agents" have become one of the hottest areas…
Hugging Face Launches an Official MCP Server: Giving AI Assistants Direct Access to Hub Models and Datasets ★ 82
Hugging Face Blog · 340 days ago · New Tool
Hugging Face has officially announced the launch of its dedicated MCP (Model Context Protocol) server — a major step in ecosystem integration. The Model…
Upskill Your Large Language Model (LLM) with Gradio MCP Servers ★ 82
Hugging Face Blog · 341 days ago · Release
With Anthropic's introduction of the Model Context Protocol (MCP) open standard, the way AI agents connect to external tools and data sources has become…
Using AI Right Now: A Practical Quick Guide (by Ethan Mollick) ★ 85
One Useful Thing (Mollick) · 356 days ago · Tutorial
University of Pennsylvania Wharton School professor Ethan Mollick recently published an extremely practical AI quick guide, "Using AI Right Now: A Quick…
Building Tiny Agents in Python: An MCP-Powered Agent in About 70 Lines of Code ★ 80
Hugging Face Blog · 388 days ago · Tutorial
Hugging Face recently published a highly practical technical tutorial demonstrating how to build a fully functional miniature AI agent in just around 70 lines…
Vercel Announces Support for Deploying MCP (Model Context Protocol) Servers, Simplifying AI Agent Toolchains ★ 85
Vercel Changelog · 403 days ago · Release
Vercel has officially announced support for deploying MCP (Model Context Protocol) servers. This update allows developers to use Vercel's Serverless…
Personality and Persuasion: Learning from AI's "Sycophancy Effect" ★ 75
One Useful Thing (Mollick) · 409 days ago · Opinion
Wharton School professor Ethan Mollick, in his latest article "Personality and Persuasion," delves into AI's persuasive power and the psychological mechanisms…
How to Build an MCP Server with Gradio ★ 80
Hugging Face Blog · 411 days ago · Tutorial
Since Anthropic introduced the Model Context Protocol (MCP) open standard, connecting large language models (LLMs) to external tools has never been easier. The…
Tiny Agents: A Lightweight MCP-Enabled AI Agent in 50 Lines of Code ★ 80
Hugging Face Blog · 416 days ago · Tutorial
In this Hugging Face blog post, the team demonstrates how to implement a fully functional, lightweight AI agent (referred to as a "Tiny Agent") that supports…
Introducing HELMET: A Next-Generation Benchmark for Comprehensively Evaluating Long-Context LLMs ★ 80
Hugging Face Blog · 425 days ago · Release
Background and Pain Points: Moving Beyond the Overly Simple "Needle in a Haystack" Test. In recent years, the context window length supported by large…
Google Announces the Agent2Agent (A2A) Protocol: Working Alongside Anthropic's MCP Toward a New Standard for Cross-Agent Collaboration ★ 85
TLDR AI (Buttondown) · 431 days ago · Release
At the 2025 Google Cloud Next conference, Google dropped two bombshells regarding the AI Agent ecosystem. The CEOs of Google and Google DeepMind jointly…
Hugging Face Launches DABStep: A New Benchmark for Evaluating Multi-Step Reasoning in Data Agents ★ 75
Hugging Face Blog · 496 days ago · Release
As large language model (LLM) technology has evolved, AI has transformed from a simple question-answering assistant into an "AI agent" capable of proactively…
Hugging Face's Lightweight Agent Framework smolagents Now Officially Supports Vision Language Models (VLMs)! ★ 80
Hugging Face Blog · 507 days ago · Release
On January 24, 2025, Hugging Face announced that smolagents — its open-source library designed for building lightweight, high-performance AI agents — now…
Hugging Face Launches smolagents: A Minimalist AI Agent Framework That Writes Actions in Python Code ★ 85
Hugging Face Blog · 531 days ago · Release
Hugging Face officially launched a lightweight AI agent development framework called `smolagents` at the end of 2024. The core philosophy of this tool is "Code…
Rethinking Arabic LLM Evaluation: The AraGen Benchmark and 3C3H Evaluation Framework Arrive on Hugging Face
Hugging Face Blog · 558 days ago · Release
Background and Challenges: The Difficulty of Evaluating Non-English LLMs. In the current landscape of large language model (LLM) development, evaluating…
Letting Large Models Debate: The First Multilingual LLM Debate Competition ★ 75
Hugging Face Blog · 572 days ago · Release
This article from the Hugging Face blog introduces "The First Multilingual LLM Debate Competition." As large language models (LLMs) have rapidly advanced…
Hugging Face and Atla Launch "Judge Arena": A New Benchmark for Evaluating LLMs as Judges ★ 80
Hugging Face Blog · 573 days ago · Release
As large language models (LLMs) have rapidly advanced, traditional static benchmarks (such as MMLU) have increasingly faced saturation and gaming problems. As…
Eval-Driven Development: Build Better AI Applications Faster ★ 80
Vercel Changelog · 605 days ago · Opinion
As generative AI applications become more widespread, one of the biggest challenges developers face is the "non-deterministic" output of large language models…
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs. Do We Still Need Fine-Tuning? ★ 75
Hugging Face Blog · 690 days ago · Paper
Background and Challenges. Document Visual Question Answering (DocVQA) is an important application of multimodal AI, requiring models to simultaneously…
BigCodeBench: The Next-Generation Code LLM Benchmark and Successor to HumanEval ★ 80
Hugging Face Blog · 727 days ago · Release
As large language models (LLMs) have made tremendous strides in code generation, the long-standing industry gold standard — the HumanEval benchmark — has…
Replicate Intelligence #1: Hands-On with Llama 3, Open-Source Smart Glasses, and Steering Language Models via Dictionary Learning ★ 75
Replicate Blog · 752 days ago · Commentary
This Replicate technical digest (Intelligence #1) compiles three of the most talked-about technical breakthroughs and open-source projects in the AI community…
Introducing the LiveCodeBench Leaderboard: Holistic and Contamination-Free Evaluation of Code LLMs ★ 75
Hugging Face Blog · 790 days ago · Release
As code large language models (Code LLMs) develop rapidly, fairly and accurately evaluating their capabilities has become a major challenge. Traditional…
Hugging Face Launches the ConTextual Leaderboard: Evaluating Multimodal Models' Joint Image-Text Reasoning in Text-Rich Scenes ★ 75
Hugging Face Blog · 832 days ago · Release
Hugging Face has announced the launch of a new multimodal benchmark and leaderboard called "ConTextual," aimed at addressing the shortcomings of existing…
