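Hugging Face Launches the NPHardEval Leaderboard: Revealing the Reasoning Abilities of Large Language Models Through Computational Complexity and Dynamic Updates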
Hugging Face Blog·863 days ago·Release
Hugging Face has launched the **NPHardEval** leaderboard, a benchmark specifically designed to evaluate the reasoning capabilities of large…