Latent Space interviews Cognition's Walden Yan and OpenInspect's Cole Murray on the rise of async coding agents. The discussion centers on Devin-related workflows, including 80% Devin commits, spec-to-PR development, full VMs, agent memory, and PMs shipping code. The key theme is not a model release but a shift toward agents that can work asynchronously inside more complete software delivery loops.
This Show HN submission points to “Continue? Y/N,” a 60-second game about AI agent permission fatigue. Because no article body was provided, the available information only suggests an interactive commentary on how repeated approval prompts can wear users down. The project appears most relevant to developers, designers, and product teams thinking about agent UX, consent flows, and trust boundaries.
Artificial Analysis and IBM present ITBench-AA, described in the title as the first benchmark for agentic enterprise IT tasks. The headline result is that frontier models score below 50%, suggesting current systems still struggle with enterprise-grade agent workflows. The original article text is unavailable here, so task design, evaluated models, scoring methodology, and rankings cannot be confirmed.
INSIDE frames enterprise AI through a sharp ROI gap: a 2025 MIT survey said 95% of companies had not seen returns despite massive AI spending. It also cites Gartner’s forecast that Fortune 500 companies may average 150,000 agents by 2028. The article focuses on Google Cloud’s view of how enterprises should prepare for AI agents and allocate IT budgets for real deployment.