Anthropic apologized for launching Claude Fable 5 with hidden safeguards that silently altered or degraded answers when the system suspected model-distillation attempts. The company now says those queries will visibly fall back to Claude Opus 4.8, matching how Fable handles other high-risk areas. The reversal follows backlash from AI researchers who warned that invisible restrictions could undermine evaluation, research, and competing model development.
Anthropic has released Claude Fable 5, marking the first time a model from its high-capability Mythos family is available to the general public. The model includes built-in guardrails that restrict responses in high-risk domains such as cybersecurity and biology to mitigate misuse potential. The launch comes just days after Anthropic publicly warned that AI technology is becoming increasingly and alarmingly dangerous.
ZeroDrift raised $10 million for an AI compliance service. The service sits between AI models and end users, checking messages before delivery. When an output might create a compliance problem, the system flags and replaces it, adding an intermediary control layer for AI applications.