Latest in AI

A Practical AI Guide for Late 2025: Ethan Mollick's Opinionated Usage Advice ★ 85
One Useful Thing (Mollick) · 238 days ago · Tutorial
Wharton School professor Ethan Mollick has put together a highly personal and practical operating guide for the AI landscape of late 2025. He emphasizes that…
Running Next.js Inside ChatGPT: A Deep Dive into Native App Integration ★ 85
Vercel Changelog · 242 days ago · Release
At a time when AI-assisted development is rapidly evolving, Vercel has published a deep technical breakdown exploring how to natively integrate its React…
Working with Wizards: Verifying AI's Magic on the Jagged Technological Frontier ★ 85
One Useful Thing (Mollick) · 276 days ago · Opinion
University of Pennsylvania Wharton School professor Ethan Mollick, in his latest article, compares the experience of collaborating with generative AI (such as…
OpenAI gpt-oss Speedup Tricks You Can Use Directly in Transformers 🫵 ★ 82
Hugging Face Blog · 276 days ago · Tutorial
Background and the LLM Inference Bottleneck: When running large language models (LLMs), autoregressive generation is inherently "memory-bandwidth-bound"…
Mass Intelligence: From GPT-5 to Small Edge Models, Powerful AI Is Becoming Widely Available ★ 85
One Useful Thing (Mollick) · 290 days ago · Opinion
In this article exploring "Mass Intelligence," University of Pennsylvania Wharton School professor Ethan Mollick reveals an imminent future: high-level…
TextQuests: How Well Do LLMs Really Perform in Text Adventure Games? Hugging Face Launches a New Evaluation Benchmark ★ 75
Hugging Face Blog · 306 days ago · Release
Hugging Face has recently introduced a new benchmark called "TextQuests," designed to evaluate the performance of large language models (LLMs) in text-based…
GPT-5: It Just Does Stuff (A New Era of Letting AI Take Charge of Tasks) ★ 85
One Useful Thing (Mollick) · 311 days ago · Opinion
Renowned AI scholar and Wharton School professor Ethan Mollick published a forward-looking observation about GPT-5 on his blog "One Useful Thing," titled…
GPT-5, GPT-5-mini, and GPT-5-nano Now Available on Vercel AI Gateway ★ 95
Vercel Changelog · 311 days ago · Release
Vercel announced in its official Changelog that OpenAI's latest generation flagship model GPT-5, along with its lightweight version GPT-5-mini and…
Welcome GPT OSS! OpenAI's New Open-Source Model Family Officially Lands on Hugging Face ★ 95
Hugging Face Blog · 313 days ago · Release
The Hugging Face official blog has announced exciting news, formally welcoming OpenAI's newly launched open-source model family — "GPT OSS." This is undeniably…
TimeScope: A New Benchmark for Probing the Limits of Long-Video Understanding in Video Large Multimodal Models (Video LMMs) ★ 75
Hugging Face Blog · 326 days ago · Release
As large multimodal models (LMMs) have achieved breakthroughs in image and short-video understanding, the industry has gradually shifted its attention to the…
Back to the Future: Hugging Face Launches FutureBench to Evaluate AI Agents' Ability to Predict Future Events ★ 75
Hugging Face Blog · 332 days ago · Release
What is FutureBench? As large language models (LLMs) and AI agents have rapidly advanced, traditional static benchmarks (such as MMLU and GSM8K) face a…
ScreenEnv: Deploy Your Full-Stack Desktop AI Agent Environment ★ 82
Hugging Face Blog · 339 days ago · New Tool
With the rise of Anthropic's Claude 3.5 Sonnet "Computer Use" and various GUI-oriented multimodal models, "desktop agents" have become one of the hottest areas…
Using AI Right Now: A Practical Quick Guide (by Ethan Mollick) ★ 85
One Useful Thing (Mollick) · 356 days ago · Tutorial
University of Pennsylvania Wharton School professor Ethan Mollick recently published an extremely practical AI quick guide, "Using AI Right Now: A Quick…
Run OpenAI's Latest Models on Replicate: GPT-4.1, GPT-4o, and the o-Series Supported
Replicate Blog · 388 days ago · New Tool
Replicate, the well-known AI model hosting and deployment platform, has announced a major update: it now officially supports OpenAI's latest-generation models…
Personality and Persuasion: Learning from AI's "Sycophancy Effect" ★ 75
One Useful Thing (Mollick) · 409 days ago · Opinion
Wharton School professor Ethan Mollick, in his latest article "Personality and Persuasion," delves into AI's persuasive power and the psychological mechanisms…
OpenAI Announces the o3 and o4-mini Reasoning Models and the Open-Source Terminal Tool Codex CLI ★ 90
TLDR AI (Buttondown) · 423 days ago · Release
OpenAI recently held a live stream and published a blog post to officially announce the new reasoning model o3 and the lightweight reasoning model o4-mini…
Introducing HELMET: A Next-Generation Benchmark for Comprehensively Evaluating Long-Context LLMs ★ 80
Hugging Face Blog · 424 days ago · Release
Background and Pain Points: Moving Beyond the Overly Simple "Needle in a Haystack" Test. In recent years, the context window length supported by large…
OpenAI Launches Its New Workhorse Model GPT 4.1: A New Balance of Performance and Practicality ★ 85
TLDR AI (Buttondown) · 425 days ago · Release
OpenAI has officially released its new flagship model GPT 4.1, positioned as the next-generation "workhorse" designed to give developers and enterprises the…
[AINews] Undercurrents Beneath the Calm: OpenAI's Memory Update and GPT-4.1 Leak, xAI Officially Launches the Grok 3 API ★ 75
TLDR AI (Buttondown) · 429 days ago · Release
Although AINews characterized these two days as "a calm day," in reality, tech giants and the open-source community remained full of undercurrents. First, on…
Hugging Face Launches DABStep: A New Benchmark for Evaluating Data Agents' Multi-Step Reasoning ★ 75
Hugging Face Blog · 495 days ago · Release
As large language model (LLM) technology has evolved, AI has transformed from a simple question-answering assistant into an "AI agent" capable of proactively…
Hugging Face's Lightweight Agent Framework smolagents Now Officially Supports Vision Language Models (VLMs)! ★ 80
Hugging Face Blog · 506 days ago · Release
On January 24, 2025, Hugging Face announced that smolagents — its open-source library designed for building lightweight, high-performance AI agents — now…
Hugging Face Launches smolagents: A Minimalist AI Agent Framework That Writes Its Actions in Python Code ★ 85
Hugging Face Blog · 530 days ago · Release
Hugging Face officially launched a lightweight AI agent development framework called `smolagents` at the end of 2024. The core philosophy of this tool is "Code…
Rethinking Arabic LLM Evaluation: The AraGen Benchmark and 3C3H Evaluation Framework Arrive on Hugging Face
Hugging Face Blog · 557 days ago · Release
Background and Challenges: The Difficulty of Evaluating Non-English LLMs. In the current landscape of large language model (LLM) development, evaluating…
Letting Large Models Debate: The First Multilingual LLM Debate Competition ★ 75
Hugging Face Blog · 571 days ago · Release
This article from the Hugging Face blog introduces "The First Multilingual LLM Debate Competition." As large language models (LLMs) have rapidly advanced…
Hugging Face and Atla Launch "Judge Arena": A New Benchmark for Evaluating LLMs as Judges ★ 80
Hugging Face Blog · 572 days ago · Release
As large language models (LLMs) have rapidly advanced, traditional static benchmarks (such as MMLU) have increasingly faced saturation and gaming problems. As…
Expert Support Case Study: Using LLM-as-a-Judge Evaluation to Strengthen Digital Green's RAG Agricultural Q&A Application ★ 75
Hugging Face Blog · 594 days ago · Tutorial
This case study provides a detailed account of how non-profit organization Digital Green, with support from Hugging Face's Expert Support team, optimized its…
Eval-Driven Development: Build Better AI Applications Faster ★ 80
Vercel Changelog · 605 days ago · Opinion
As generative AI applications become more widespread, one of the biggest challenges developers face is the "non-deterministic" output of large language models…
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs. Do We Still Need Fine-Tuning? ★ 75
Hugging Face Blog · 689 days ago · Paper
Background and Challenges: Document Visual Question Answering (DocVQA) is an important application of multimodal AI, requiring models to simultaneously…
Hugging Face's Transformers Code Agent Sets a New Record on the GAIA Benchmark 🏅 ★ 80
Hugging Face Blog · 713 days ago · Release
The Hugging Face team published a blog post announcing that their Code Agent, developed using the `transformers` library, achieved a breakthrough score on the…
BigCodeBench: The Successor to HumanEval as the Next-Generation Code LLM Benchmark ★ 80
Hugging Face Blog · 726 days ago · Release
As large language models (LLMs) have made tremendous strides in code generation, the long-standing industry gold standard — the HumanEval benchmark — has…
