How we got here
This release started as a question, not a feature: *"what's my workflow on the new frontier model, and where could cheaper models do the work?"* The first attempt to answer it with ax failed - and the failure was the roadmap.
Subagent token-usage rows all had `model = null`. The transcripts carried the model on every message and the extractor read it correctly; two ingest bugs erased it. The session upsert in `derive-claude-subagents` dropped the extracted model, and the session-health pass clobbered the correct value the transcript-priced writer had just written - `model_ref` survived on the same rows, which is what proved the clobber (#300). One backfill later (`AX_REDERIVE_SUBAGENTS=1 ax ingest --stages=subagents`), 1,500+ rows were correctly attributed, and the answer to the original question became one command.
The same day, the wider conversation caught up - "do the auto-routing under the hood so users don't cap out their limits" was the live critique of frontier-model packaging. This release is that, user-side and measured: the expensive model keeps judgment, planning, and review; mechanical dispatches carry an explicit cheaper model; and the graph proves whether the routing actually happened.
What changed
Cost visibility (#301):
ax cost split --days=7 # main loop vs subagents, by model
ax cost models # per-model rollup
ax cost sessions # top sessions by costThe split is the headline view: it shows the main loop's cache reads dominating spend, and the share of subagent work inherited by the expensive model. On the machine that built this, 77% of dispatches inherited the frontier model while doing mechanical work.
Dispatch routing (#303, #302, #304):
ax dispatches --candidates # model-less dispatches + identified redirectable spend
ax dispatches compile-routing # regenerate ~/.ax/hooks/routing-table.json
ax hooks install ~/.ax/hooks/route-dispatch.ts --providers=claudeThe `route-dispatch` hook warns at dispatch time when a mechanical dispatch forgets an explicit model - backtested against 396 real dispatches before install (156 would-warn, 0 would-block). Policy is deliberate: quality reviews and PR reviews stay on the main model; only spec reviews, searches, research, well-specified implementation, and bulk-mechanical work route down.
The self-tuning loop (#310, #307, #306): the routing table is not a file anyone promises to maintain. `/routing-tune` - a committed Claude Code workflow - mines dispatch history for unmatched expensive work, adversarially backtests every proposed class against judgment-work false positives, and emits a review brief. Its first live run found $1,127 of unmatched spend, proposed 5 classes, and 3 survived review; same-window detector coverage went from $404 to $573 over 30 days (+42%). Those totals are *identified redirectable spend* - a retrospective repricing, not realized savings; realized savings show up go-forward as the inherit rate drops. The `efficient-dispatch` skill (via `npx skills add Necmttn/ax`) teaches the orchestration pattern, with its routing table generated from the same constant the hook reads - a CI test keeps all copies byte-identical. When missed savings accumulate, the improve loop surfaces a routing proposal on its own.
Studio Story tab (#309, #235): shared sessions now render a narration-grounded review surface - files-touched tree, three-pane review, and the Story tab over share manifest v5.
Ingest resilience (#295, #294): per-file failure isolation now covers Pi, OpenCode, and Cursor - one corrupt transcript skips that file instead of failing the stage, and the dashboard Live tab lists what was skipped and why.
CLI internals (#293, #291, #289): the command manifest now owns visibility for every command family, shared `fail()`/flag helpers replaced ad-hoc exit sites, and query tests share one `makeMockDb`.
Discovery for LLMs: the site now serves `/llms.txt` - a machine-readable index of everything ax does, including 25 "how do I..." Q&A entries (one-hop commands and the multi-hop graph questions like fragility cascades). A CI gate fails any new CLI subcommand that isn't documented there.
Why it matters
Frontier models are worth paying for judgment - and exactly that. Every mechanical dispatch they inherit is rate-limit headroom and money spent on work a cheaper model does fine. This release makes that leak visible (`ax cost split`), names it per-dispatch (`ax dispatches --candidates`), nudges it at the moment of the mistake (route-dispatch hook), and re-derives the routing policy from your own history (`/routing-tune`) - with the graph, not vibes, deciding whether any of it worked.