Latest in AI

Showing:ResearchersClaudeClear ×

🔥 Trending today

open-source3 anthropic3 amazon3 ai-regulation2 government-policy2 export-controls2 geopolitics2 privacy2 python-packaging2 webassembly2

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

Hugging Face 推出 Red-Teaming 抗性排行榜：評估 LLM 抵禦惡意越獄與對抗性攻擊的能力★ 75
Hugging Face Blog843 days agoRelease
### Background: The Shortcomings of Static Safety Evaluations As large language models (LLMs) are widely adopted across industries, AI safety has become an…
Hugging Face 推出 NPHardEval 排行榜：透過計算複雜度與動態更新揭示大型語言模型的推理能力★ 75
Hugging Face Blog864 days agoRelease
Hugging Face has announced the launch of the new **NPHardEval** leaderboard — a benchmark specifically designed to evaluate the reasoning capabilities of large…
Hugging Face 推出「企業情境排行榜」：專為真實世界應用設計的 LLM 評測基準★ 75
Hugging Face Blog866 days agoRelease
Hugging Face has partnered with Patronus AI — a startup focused on LLM evaluation and defense — to officially launch the **Enterprise Scenarios Leaderboard**…
Hugging Face 推出「幻覺排行榜」，開源量化評估大型語言模型的幻覺率★ 75
Hugging Face Blog868 days agoRelease
While large language models (LLMs) have demonstrated remarkable generative capabilities across many domains, "hallucination" — where a model confidently…
基座模型能像人類一樣標記數據嗎？Hugging Face 探討 AI 標記與 RLHF 的可行性★ 75
Hugging Face Blog1,099 days agoCommentary
In the development of large language models (LLMs), RLHF (Reinforcement Learning from Human Feedback) is the critical step for aligning models with human…
什麼讓對話代理（Dialog Agent）變得實用？Hugging Face 深度解析★ 75
Hugging Face Blog1,238 days agoOpinion
Amid the generative AI wave sparked by ChatGPT, Hugging Face published this in-depth article exploring how to transform "base language models" — which can only…

← PreviousPage 7