A Reddit user on r/LocalLLaMA asks for practical comparisons between qwopus and Qwen3.6 27B, specifically for coding work. They note conflicting community opinions, with some users calling qwopus worse and others saying it is much better. In their own simple tests, they did not notice clear differences and want feedback from people using these models for agentic coding.
QbitAI says Anthropic introduced Claude Fable 5 for general users and Claude Mythos 5 for a small set of trusted users. The article highlights software engineering, long-context work, native vision, memory, and scientific research capabilities. It also focuses on a safety-routing design where Fable 5 downgrades high-risk requests to Claude Opus 4.8 instead of simply refusing.
Anthropic released Claude Fable 5 as its first broadly available Mythos-class model, alongside restricted Mythos 5 access. Benchmarks and ecosystem reports show strong gains in coding, long-horizon agentic tasks, research, and vision. The controversy centers on 30-day retention for Mythos-class traffic and on silent interventions that may reduce effectiveness on frontier LLM development tasks, raising concerns about trust, reproducibility, and open AI.
Mistral AI announced two Devstral updates focused on agentic coding workflows: Devstral Small 1.1 and Devstral Medium. Devstral Small 1.1 remains a 24B Apache 2.0 open model and reaches 53.6% on SWE-Bench Verified. Devstral Medium reaches 61.6%, is available through Mistral’s API, and supports private deployment and custom finetuning for enterprises.
Mistral Medium 3.5 is a 128B dense flagship model with a 256k context window, combining instruction-following, reasoning, and coding. It becomes the default model for Le Chat and Mistral Vibe, enabling cloud-based remote coding agents launched from the CLI or chat. The release also adds Le Chat Work mode for multi-step, cross-tool workflows with visible actions and approval gates for sensitive operations.
This arXiv paper studies token consumption in LLM-based multi-agent software engineering. Using 30 ChatDev tasks with a GPT-5 reasoning model, the authors map internal phases to SDLC stages such as design, coding, review, testing, and documentation. Preliminary results suggest code review dominates token usage, averaging 59.4%, while input tokens form the largest share, pointing to inefficiencies in agent collaboration.
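A per-stage share such as the one reported for code review can be computed from phase-level token logs roughly as follows. This is a minimal sketch, not the paper's method: the log layout, the function names `stage_shares` and `input_share`, and every number in `PHASE_LOGS` are hypothetical placeholders chosen for illustration. It also pools tokens across all records, whereas the summary describes an average, which may have been taken per task.

```python
from collections import defaultdict

# Hypothetical per-phase token logs: (sdlc_stage, input_tokens, output_tokens).
# These numbers are illustrative placeholders, not data from the paper.
PHASE_LOGS = [
    ("design", 1200, 400),
    ("coding", 3000, 1500),
    ("review", 9000, 1000),
    ("testing", 2000, 500),
    ("documentation", 1000, 400),
]


def stage_shares(logs):
    """Return each stage's percentage of all tokens (input + output)."""
    totals = defaultdict(int)
    for stage, input_tokens, output_tokens in logs:
        totals[stage] += input_tokens + output_tokens
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {stage: 0.0 for stage in totals}
    return {stage: 100.0 * count / grand_total for stage, count in totals.items()}


def input_share(logs):
    """Return the percentage of all tokens that were input tokens."""
    total_input = sum(input_tokens for _, input_tokens, _ in logs)
    total = sum(input_tokens + output_tokens for _, input_tokens, output_tokens in logs)
    return 100.0 * total_input / total if total else 0.0


if __name__ == "__main__":
    for stage, share in sorted(stage_shares(PHASE_LOGS).items(), key=lambda kv: -kv[1]):
        print(f"{stage:<14} {share:5.1f}%")
    print(f"input tokens   {input_share(PHASE_LOGS):5.1f}%")
```

Separating the stage breakdown from the input/output breakdown mirrors the two observations in the summary: which stage consumes the most tokens, and whether those tokens are mostly context being re-read (input) or text being generated (output).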
The post inspects @anthropic-ai/[email protected] and documents configuration fields not covered by the official docs. It highlights hook JSON responses, hidden skill and agent frontmatter, auto-mode rules, persistent memory, dream consolidation, Magic Docs, and permission syntax. The author frames these as practical but version-specific findings, with experimental fields especially likely to change.