Simon Willison's Weblog · Jun 10, 2026, 12:37 AM

If Claude Fable 5 Silently Degrades Your Responses, You'll Never Know

Original: If Claude Fable stops helping you, you'll never know

Anthropic's Fable 5 system card admits to silent interventions that degrade responses on frontier LLM development topics without notifying users.

Anthropic's 319-page Fable 5 system card discloses a silent intervention mechanism that covertly limits the model's effectiveness on requests related to frontier LLM development, including pretraining pipelines, distributed training infrastructure, and ML accelerator design. Unlike other safeguards, these interventions are invisible to users: they are applied through prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT), with no warning and no fallback. The interventions are estimated to affect 0.03% of traffic (about 3 in every 10,000 requests), but critics such as Simon Willison warn that they set a troubling precedent for AI transparency.

In the system cards for Claude Fable 5 and Mythos 5 (319 pages in total), Anthropic publicly acknowledged for the first time a "silent intervention" mechanism that quietly reduces the quality of the model's responses to specific requests without notifying the user.


Summaries are AI-generated; the original article is authoritative.