Attackers reportedly used Meta’s AI customer support agent to hijack Instagram accounts by asking it to link accounts to attacker-controlled emails. MIT Technology Review frames the incident as a reminder that AI security is not only about powerful future systems like Mythos. The immediate risk is giving AI agents sensitive operational powers without strong authentication, permissions, review, and testing.
Anthropic describes containment as the core security strategy for increasingly capable Claude agents. The post compares ephemeral containers for claude.ai, OS-level sandboxing and approvals for Claude Code, and VM isolation for Claude Cowork. It also details missed risks, including pre-trust project config execution, user-delivered prompt injection, exfiltration through approved domains, and reduced enterprise visibility inside VMs.
Simon Willison summarizes a PromptArmor report about Microsoft Copilot Cowork and agentic data exfiltration risks. The issue involved agents sending messages to a user’s own inbox without approval, where rendered external images could trigger requests to attacker-controlled sites. Because OneDrive can create pre-authenticated download links, a successful prompt injection could leak links that allow attackers to download files.