Tokenomics: Quantifying Where Tokens Are Used in Agentic Software Engineering

A paper measures where tokens are spent in LLM multi-agent software engineering workflows.

This arXiv paper studies token consumption in LLM-based multi-agent software engineering. Using 30 ChatDev tasks run with a GPT-5 reasoning model, the authors map the framework's internal phases to SDLC stages such as design, coding, review, testing, and documentation. Preliminary results suggest that code review dominates token usage, averaging 59.4% of tokens, and that input tokens make up the largest share of consumption, which points to inefficiencies in agent collaboration.
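The per-stage accounting described above can be illustrated with a minimal sketch. This is not the authors' tooling: the phase-to-stage mapping, the function names, and every token count below are placeholder assumptions chosen for illustration, not figures from the paper.

```python
from collections import defaultdict

# Assumed mapping from ChatDev-style phase names to SDLC stages.
# The paper's actual mapping may differ; this one is illustrative only.
PHASE_TO_STAGE = {
    "DemandAnalysis": "design",
    "Coding": "coding",
    "CodeReviewComment": "review",
    "CodeReviewModification": "review",
    "TestErrorSummary": "testing",
    "Manual": "documentation",
}


def stage_shares(records):
    """Return each SDLC stage's percentage of total tokens (input + output)."""
    totals = defaultdict(int)
    for phase, input_tokens, output_tokens in records:
        totals[PHASE_TO_STAGE[phase]] += input_tokens + output_tokens
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {stage: 100.0 * n / grand_total for stage, n in totals.items()}


def input_share(records):
    """Return input tokens as a percentage of all tokens consumed."""
    total_input = sum(inp for _, inp, _ in records)
    total = sum(inp + out for _, inp, out in records)
    return 100.0 * total_input / total if total else 0.0


# Placeholder log for one task: (phase, input_tokens, output_tokens).
# These numbers are invented and do NOT come from the paper.
records = [
    ("DemandAnalysis", 800, 200),
    ("Coding", 1500, 1000),
    ("CodeReviewComment", 2500, 500),
    ("CodeReviewModification", 1500, 500),
    ("TestErrorSummary", 800, 200),
    ("Manual", 400, 100),
]

for stage, pct in sorted(stage_shares(records).items()):
    print(f"{stage:>14}: {pct:.1f}%")
print(f"input share: {input_share(records):.1f}%")
```

Averaging such per-task shares over a set of tasks would give a figure comparable in kind to the reported review-stage average, assuming the underlying logs record input and output tokens per phase.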

This paper focuses on the actual operating costs of "agentic software engineering": when LLM-based multi-agent systems automate tasks such as requirements analysis, code generation, testing, and documentation, where exactly are the tokens spent? The authors point out that although such systems are increasingly viewed as candidates for automating software development, their resource consumption, cost predictability, and environmental impact have not been clearly quantified, which hinders practical adoption.

Summaries are AI-generated; the original article is authoritative.