Latest in AI

Showing: coding-agents, Founders

AINews: Fable and Mythos Access Suspended Over Cybersecurity Risk (★ 76)
Latent Space | yesterday | Incident
Anthropic’s Claude Fable 5 and Mythos 5 were abruptly suspended after a US export-control directive tied to a possible jailbreak and national cybersecurity risk. The roundup frames the event as a new “model sovereignty” warning for teams relying on closed frontier APIs. It also covers Kimi-K2.7-Code, MiniMax M3, DeepSWE replacing SWE-Bench Pro, agent-inference benchmarks, sandboxing, and Gemini-SQL2.
Program Claude Code, Codex, Pi and Other Agent Harnesses with AI SDK
Vercel Changelog | 2 days ago | Release
Vercel’s changelog entry says AI SDK can now be used to program agent harnesses including Claude Code, Codex, Pi, and other similar tools. Based on the title alone, the update appears aimed at developers who want a common programming interface around coding agents and AI assistant runtimes. No implementation details, APIs, examples, pricing, availability limits, or supported harness list beyond the named products are provided in the source text.
Introducing FrontierCode (★ 78)
Hacker News (AI keywords) | 5 days ago | Benchmark
Cognition launched FrontierCode, a coding benchmark focused on mergeability rather than only functional correctness. It evaluates correctness, tests, scope discipline, style, and repository-specific quality standards. Built with open-source maintainers and extensive quality control, it shows current frontier models still struggle: Claude Opus 4.8 scores 13.4% on the hardest Diamond subset, ahead of GPT-5.5 and Gemini 3.1 Pro.
Remote agents in Vibe, powered by Mistral Medium 3.5 (★ 78)
Mistral AI News | 6 days ago | New Tool
Mistral Medium 3.5 is a 128B dense model in public preview, combining instruction-following, reasoning, and coding with a 256k context window. It becomes the default model for Le Chat and Mistral Vibe. Vibe now supports remote coding agents that run asynchronously in the cloud, while Le Chat adds Work mode for longer multi-step tasks across connected tools.
Introducing Mistral Small 4 (★ 78)
Mistral AI News | 6 days ago | Release
Mistral Small 4 is the next major release in the Mistral Small family, unifying Magistral-style reasoning, Pixtral-style multimodality, and Devstral-style coding agents. It uses a MoE architecture with 119B total parameters, 6B active parameters per token, a 256k context window, and configurable reasoning effort. The model is available via Mistral API, AI Studio, Hugging Face, open-source serving stacks, and NVIDIA deployment options.
Uber Caps Usage of AI Tools Like Claude Code to Manage Costs
Simon Willison's Weblog | 11 days ago | Business
Uber has reportedly capped employee token spending at $1,500 per month for each agentic AI coding tool, including Cursor and Claude Code. Simon Willison frames this as a rational response to overspending, especially after earlier discussion that Uber exhausted its 2026 AI budget in four months. He estimates that two actively used tools would imply a $36,000 annual cap per engineer, about 11% of median US Uber software engineer compensation.
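The $36,000 figure follows from simple arithmetic on the reported cap. The sketch below reproduces it and back-solves the median compensation implied by the "about 11%" share; that implied figure is derived here and is not stated in the source, and the function name is hypothetical.

```python
def annual_cap(monthly_cap_per_tool: float, tools: int) -> float:
    """Annual spending ceiling for one engineer across several capped tools."""
    return monthly_cap_per_tool * tools * 12


# Reported figures: $1,500 per month per tool, two actively used tools.
cap = annual_cap(1500, 2)

# Assumption: "about 11%" is treated as exactly 0.11, so the result is
# only a rough implied value, not a reported compensation number.
implied_median_compensation = cap / 0.11

print(f"Annual cap per engineer: ${cap:,.0f}")
print(f"Implied median compensation: ${implied_median_compensation:,.0f}")
```

Because the 11% share is rounded, the implied compensation should be read as an approximation of roughly $327,000 rather than a precise value.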
Show HN: Paseo - Beautiful open-source coding agent interface
Hacker News (AI keywords) | 11 days ago | New Tool
Paseo provides one interface for tools such as Claude Code, Codex, Copilot, OpenCode, and Pi. It runs agents through a local daemon on the user's own machine and supports desktop, mobile, web, and CLI clients. Its appeal is multi-agent orchestration and cross-device control, though real adoption depends on workflow fit, security, and reliability.
GitHub's Plan for Agents — Kyle Daigle, GitHub
Latent Space | 12 days ago | Business
GitHub helped pioneer modern AI coding with Copilot, accelerating the adoption of AI-assisted development. The subsequent rise of agentic coding has placed notable strain on the widely used developer platform. Kyle Daigle of GitHub discusses the company's plan for responding to this shift, although the provided excerpt does not specify products, features, or timelines.
The solution might be cancelling my AI subscription
Simon Willison's Weblog | 14 days ago | Commentary
Simon Willison relates to David Wilson's reflection on launching more than 16 projects with AI tooling. A request for a quick Claude script can expand into an hour-long project without solving the original problem. Coding agents may produce tested, documented solutions rapidly, but people can maintain only so many projects. The critical skill may be discipline: deciding which ideas deserve continued attention.
The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray
Latent Space | 16 days ago | Commentary
Latent Space interviews Cognition's Walden Yan and OpenInspect's Cole Murray on the rise of async coding agents. The discussion centers on Devin-related workflows, including 80% Devin commits, spec-to-PR development, full VMs, agent memory, and PMs shipping code. The key theme is not a model release, but a shift toward agents that can work asynchronously inside more complete software delivery loops.
I think Anthropic and OpenAI have found product-market fit (★ 76)
Simon Willison's Weblog | 18 days ago | Commentary
Simon Willison says Claude Code/Cowork and OpenAI Codex have changed the economics of frontier AI. Personal subscriptions can still be bargains for heavy users, but enterprise plans are increasingly priced like API token usage. His core claim is that coding agents burn far more tokens, yet deliver enough value to high-paid knowledge workers that companies will pay materially more.
How Conductor moved parallel coding agents from the laptop to the cloud with Vercel Sandbox
Vercel Changelog | 18 days ago | Business
Based on the title, the article describes Conductor shifting parallel coding-agent execution from developers’ laptops to Vercel Sandbox in the cloud. The likely focus is cloud isolation, parallel agent workflows, and reducing dependence on local machine resources. The full article text was not provided, so implementation details, metrics, model choices, and concrete results cannot be confirmed.
Launch HN: Runtime (YC P26) – Sandboxed coding agents for every team
Hacker News (AI keywords) | 24 days ago | New Tool
Runtime is a YC P26 launch focused on making coding agents usable across an organization, not only by engineers. It provides sandboxed environments with company context, integrations, secrets, policies, observability, and cost controls. The product page says it works with tools including Claude Code, Cursor, Codex, Copilot, Gemini CLI, Devin, and OpenCode, while fitting into Slack, Linear, GitHub, and related workflows.