Hugging Face Launches Red-Teaming Resistance Leaderboard: Evaluating LLMs' Ability to Withstand Malicious Jailbreaks and Adversarial Attacks ★ 75
Hugging Face Blog·842 days ago·Release
### Background: The Shortcomings of Static Safety Evaluations

As large language models (LLMs) are widely adopted across industries, AI safety has become an…