Latest in AI

How Useful Is qwopus Compared With Qwen3.6 27B for Coding?
r/LocalLLaMA top day · 4 days ago · Opinion
A Reddit user on r/LocalLLaMA asks for practical comparisons between qwopus and Qwen3.6 27B, specifically for coding work. They note conflicting community opinions, with some users calling qwopus worse and others saying it is much better. In their own simple tests, they did not notice clear differences and want feedback from people using these models for agentic coding.
Claude Mythos 5 Released: 50 Million Lines of Code in One Day ★ 74
量子位 QbitAI · 4 days ago · Release
QbitAI says Anthropic introduced Claude Fable 5 for general users and Claude Mythos 5 for a small set of trusted users. The article highlights software engineering, long-context work, native vision, memory, and scientific research capabilities. It also focuses on a safety-routing design where Fable 5 downgrades high-risk requests to Claude Opus 4.8 instead of simply refusing.
Claude Code One-Year Retrospective: Development Enters the Era of Agent Armies
INSIDE 硬塞 AI · 4 days ago · Commentary
INSIDE summarizes Claude Code’s first-year reflections from its team, highlighting how agentic coding is changing software work. The article says bugs can be fixed before engineers act, Plan Mode has been overtaken by Auto Mode, and much work can happen on mobile. It also mentions Anthropic’s following-day Claude Fable 5 launch as a signal of the next stage in agent-heavy development.
Anthropic Claude Fable 5: Mythos-Class Power with Controversial Terms ★ 84
Latent Space · 4 days ago · Release
Anthropic released Claude Fable 5 as its first broadly available Mythos-class model, alongside restricted Mythos 5 access. Benchmarks and ecosystem reports show strong gains in coding, long-horizon agentic tasks, research, and vision. The controversy centers on 30-day retention for Mythos-class traffic and silent interventions that may reduce effectiveness on frontier LLM development tasks, raising trust, reproducibility, and open AI concerns.
llm 0.32a3 alpha release, almost entirely written by Claude Fable 5
Simon Willison's Weblog · 4 days ago · Release
Simon Willison has published llm 0.32a3, an alpha release of his popular LLM CLI and Python library. The standout detail is that nearly all of the code was written by the new Claude Fable 5 model using Claude Code. Willison also posted a detailed write-up covering how he used Claude Code to add features to both his datasette agent and llm projects.
Introducing Mistral Code ★ 72
Mistral AI News · 6 days ago · New Tool
Mistral AI introduced Mistral Code, an enterprise-focused AI coding assistant built on Continue and available in private beta for VSCode and JetBrains IDEs. It combines Codestral, Codestral Embed, Devstral, and Mistral Medium for autocomplete, retrieval, agentic coding, and chat. The product emphasizes secure deployment, customization, observability, RBAC, audit logging, and support for cloud, serverless, self-hosted, and air-gapped environments.
Upgrading agentic coding capabilities with the new Devstral models ★ 72
Mistral AI News · 6 days ago · Release
Mistral AI announced two Devstral updates focused on agentic coding workflows: Devstral Small 1.1 and Devstral Medium. Devstral Small 1.1 remains a 24B Apache 2.0 open model and reaches 53.6% on SWE-Bench Verified. Devstral Medium reaches 61.6%, is available through Mistral’s API, and supports private deployment and custom finetuning for enterprises.
Remote agents in Vibe. Powered by Mistral Medium 3.5. ★ 76
Mistral AI News · 6 days ago · Release
Mistral Medium 3.5 is a 128B dense flagship model with a 256k context window, combining instruction-following, reasoning, and coding. It becomes the default model for Le Chat and Mistral Vibe, enabling cloud-based remote coding agents launched from the CLI or chat. The release also adds Le Chat Work mode for multi-step, cross-tool workflows with visible actions and approval gates for sensitive operations.
Anthropic/OpenAI may be spending more than $1000 for every $100 you pay them
Hacker News (AI keywords) · 7 days ago · Business
The author uses a Claude Code coding experiment to estimate the API-equivalent cost of serious LLM coding. They argue simple chats are cheap, but complex reasoning and multi-file coding can burn large amounts of visible and hidden tokens. The piece is skeptical and estimate-driven, concluding that current $100/month plans may be heavily subsidized and economically fragile.
Tokenomics: Quantifying Where Tokens Are Used in Agentic Software Engineering
Hacker News (AI keywords) · 7 days ago · Paper
This arXiv paper studies token consumption in LLM-based multi-agent software engineering. Using 30 ChatDev tasks with a GPT-5 reasoning model, the authors map internal phases to SDLC stages such as design, coding, review, testing, and documentation. Preliminary results suggest code review dominates token usage, averaging 59.4%, while input tokens form the largest share, pointing to inefficiencies in agent collaboration.
Harness engineering: Leveraging Codex in an agent-first world ★ 76
Hacker News (AI keywords) · 9 days ago · Commentary
OpenAI describes an internal experiment where Codex generated an entire product codebase from an empty repository. The post argues that engineers shift from writing code to designing environments, constraints, documentation, and feedback loops. Key practices include repo-local knowledge, mechanical architecture enforcement, agent-readable UI and observability, lightweight PR flow, and continuous cleanup.
Ask HN: What is your (AI) dev tech stack / workflow?
Hacker News (AI keywords) · 9 days ago · Commentary
An Ask HN thread asks developers to share their current AI-assisted development setup for upcoming in-person workshops. The author wants guidance for beginners and working developers, with use cases ranging from static sites to FastAPI tools and Linux home automation. Replies cover Claude Code, Cursor, GitHub Copilot, VSCode, spec-driven development, TDD, multi-agent workflows, reviews, and quality control.
Claude Code Lead Boris Cherny Says His Code Is 100% AI-Written
INSIDE 硬塞 AI · 11 days ago · Commentary
Claude Code lead Boris Cherny says his code is now 100% written by AI while he runs hundreds of agents in parallel. The article frames engineers less as manual coders and more as conductors who define problems, review outputs, and shape architecture. It highlights a broader shift in software development workflows driven by AI coding agents, without presenting detailed benchmarks or implementation data.
GitHub's Plan for Agents — Kyle Daigle, GitHub
Latent Space · 12 days ago · Business
GitHub helped pioneer modern AI coding with Copilot, accelerating the adoption of AI-assisted development. The subsequent rise of agentic coding has placed notable strain on the widely used developer platform. Kyle Daigle of GitHub discusses the company's plan for responding to this shift, although the provided excerpt does not specify products, features, or timelines.
Claude Code: Undocumented Configuration Options from the Source
Hacker News (AI keywords) · 16 days ago · Tutorial
The post inspects a specific version of the @anthropic-ai Claude Code package and documents configuration fields not covered by the official docs. It highlights hook JSON responses, hidden skill and agent frontmatter, auto-mode rules, persistent memory, dream consolidation, Magic Docs, and permission syntax. The author frames these as practical but version-specific findings, with experimental fields especially likely to change.
Introducing Claude Opus 4.8 ★ 78
Hacker News (AI keywords) · 17 days ago · Release
Anthropic introduced Claude Opus 4.8 as an upgrade over Opus 4.7, emphasizing benchmark gains, sharper judgment, and more reliable agentic work. The launch also adds dynamic workflows in Claude Code, effort controls in claude.ai and Cowork, and Messages API support for system entries inside messages. Standard pricing remains unchanged, while fast mode is faster and substantially cheaper than before.
BenQ and MetaAge Adopt AWS Generative AI for Agentic Coding Productivity Gains
INSIDE 硬塞 AI · 18 days ago · Business
BenQ is expanding AI across its education and business display ecosystem, including software products such as SummarAI and Meeting Room System. The article says BenQ partnered with MetaAge to adopt Amazon Web Services generative AI. Its main claim is a 20x productivity improvement through Agentic Coding, though the provided excerpt does not include implementation details or measurement methodology.