Tokenmaxxing Is Dead, Long Live Tokenmaxxing
What happened
An essay on 12gramsofcarbon responds to the OpenAI/Anthropic narrative that customers are shifting 'from tokenmaxxing to efficiency' (CNBC, June 26) and argues it's an illusion.
Context and impact
The author argues agentic runtimes like Claude Code, Codex, and Cursor rationalize per-call cost but multiply call counts via planning, retries, evaluation, and verification. The resulting bill is higher, just packaged in higher abstraction ('per-task pricing'). The essay resonates with ML engineers building FinOps processes for AI workloads.
Details
- 104 points on HN, top weekend thread
- Responds to CNBC piece on 'efficiency shift'
- Argument: token usage is now per-task, not per-call
- Implication: enterprise FinOps must measure task-level cost
Open original source
12gramsofcarbon