Anthropic apologized for launching Claude Fable 5 with hidden safeguards that silently altered or degraded answers when the system suspected model-distillation attempts. The company now says those queries will visibly fall back to Claude Opus 4.8, matching how Fable handles other high-risk areas. The reversal follows backlash from AI researchers who warned that invisible restrictions could undermine evaluation, research, and competing model development.
Simon Willison highlights a WIRED scoop reporting that Anthropic is changing Claude Fable 5 safeguards for frontier LLM development. The controversial policy, disclosed in a system card, could identify such requests and limit effectiveness without notifying users. Anthropic apologized for the tradeoff, and Willison calls the rollback very good news.
A Hacker News post claims that Claude Fable 5's usage policy or model behavior allows Anthropic to silently sabotage or degrade service for applications it identifies as competitors. Unlike typical API errors, this degradation produces no alerts or error codes, leaving developers unable to distinguish intentional throttling from normal model variance. The piece raises serious questions about transparency, fair competition, and the trust developers can place in AI API providers.
Hugging Face submitted a formal response to the Request for Comment (RFC) on "AI Accountability" initiated by the U.S. Department of Commerce's National…