Route the expensive model where it earns its keep

announcement

How we got here

This release started as a question, not a feature: *"what's my workflow on the new frontier model, and where could cheaper models do the work?"* The first attempt to answer it with ax failed - and the failure was the roadmap.

Subagent token-usage rows all had `model = null`. The transcripts carried the model on every message and the extractor read it correctly; two ingest bugs erased it. The session upsert in `derive-claude-subagents` dropped the extracted model, and the session-health pass clobbered the correct value the transcript-priced writer had just written - `model_ref` survived on the same rows, which is what proved the clobber (#300). One backfill later (`AX_REDERIVE_SUBAGENTS=1 ax ingest --stages=subagents`), 1,500+ rows were correctly attributed, and the answer to the original question became one command.

The same day, the wider conversation caught up - "do the auto-routing under the hood so users don't cap out their limits" was the live critique of frontier-model packaging. This release is that, user-side and measured: the expensive model keeps judgment, planning, and review; mechanical dispatches carry an explicit cheaper model; and the graph proves whether the routing actually happened.

What changed

Cost visibility (#301):

ax cost split --days=7      # main loop vs subagents, by model
ax cost models              # per-model rollup
ax cost sessions            # top sessions by cost

The split is the headline view: it shows the main loop's cache reads dominating spend, and the share of subagent work inherited by the expensive model. On the machine that built this, 77% of dispatches inherited the frontier model while doing mechanical work.

Dispatch routing (#303, #302, #304):

ax dispatches --candidates   # model-less dispatches + identified redirectable spend
ax dispatches compile-routing  # regenerate ~/.ax/hooks/routing-table.json
ax hooks install ~/.ax/hooks/route-dispatch.ts --providers=claude

The `route-dispatch` hook warns at dispatch time when a mechanical dispatch forgets an explicit model - backtested against 396 real dispatches before install (156 would-warn, 0 would-block). Policy is deliberate: quality reviews and PR reviews stay on the main model; only spec reviews, searches, research, well-specified implementation, and bulk-mechanical work route down.

The self-tuning loop (#310, #307, #306): the routing table is not a file anyone promises to maintain. `/routing-tune` - a committed Claude Code workflow - mines dispatch history for unmatched expensive work, adversarially backtests every proposed class against judgment-work false positives, and emits a review brief. Its first live run found $1,127 of unmatched spend, proposed 5 classes, and 3 survived review; same-window detector coverage went from $404 to $573 over 30 days (+42%). Those totals are *identified redirectable spend* - a retrospective repricing, not realized savings; realized savings show up go-forward as the inherit rate drops. The `efficient-dispatch` skill (via `npx skills add Necmttn/ax`) teaches the orchestration pattern, with its routing table generated from the same constant the hook reads - a CI test keeps all copies byte-identical. When missed savings accumulate, the improve loop surfaces a routing proposal on its own.

Studio Story tab (#309, #235): shared sessions now render a narration-grounded review surface - files-touched tree, three-pane review, and the Story tab over share manifest v5.

Ingest resilience (#295, #294): per-file failure isolation now covers Pi, OpenCode, and Cursor - one corrupt transcript skips that file instead of failing the stage, and the dashboard Live tab lists what was skipped and why.

CLI internals (#293, #291, #289): the command manifest now owns visibility for every command family, shared `fail()`/flag helpers replaced ad-hoc exit sites, and query tests share one `makeMockDb`.

Discovery for LLMs: the site now serves `/llms.txt` - a machine-readable index of everything ax does, including 25 "how do I..." Q&A entries (one-hop commands and the multi-hop graph questions like fragility cascades). A CI gate fails any new CLI subcommand that isn't documented there.

Why it matters

Frontier models are worth paying for judgment - and exactly that. Every mechanical dispatch they inherit is rate-limit headroom and money spent on work a cheaper model does fine. This release makes that leak visible (`ax cost split`), names it per-dispatch (`ax dispatches --candidates`), nudges it at the moment of the mistake (route-dispatch hook), and re-derives the routing policy from your own history (`/routing-tune`) - with the graph, not vibes, deciding whether any of it worked.

referenced changes

20 linked changes

Generated from Release Please and grouped by change type. Each row keeps its issue and commit references.

Features 13 Bug Fixes 6 Performance 1

Features

01cli: ax cost models/sessions/split + MCP cost tools (#301) (a77a082)
02cli: ax dispatches + --candidates + compile-routing (#303) (8d11f5f)
03cli: ax sessions churn - verification churn insight (#308) (f75036d)
04cli: manifest owns command visibility - uniform registration loop (#293) (6d1dabc)
05dashboard: sessions list remakeover - enriched rows + insight accordion (#299) (c73dd1f)
06dashboard: surface skipped-file failures from ingest in the Live tab (#294) (d3818ef), closes #262
07hooks-sdk: route-dispatch hook - model-routing suggestions on Agent dispatches (#302) (0750963)
08improve: routing proposal rule in the proposals derive stage (#306) (739258c)
09ingest: extend per-file failure isolation to Pi, OpenCode, and Cursor (#295) (b1cf97c), closes #261
10routing: apply /routing-tune mined classes - task-N-impl, bug-fix, feature-add (#310) (009103d)
11skills: efficient-dispatch - measured model-routing orchestration (#307) (d999d2e)
12studio: files-touched tree, review view, and session narration (#235) (ec464f3)
13studio: Story tab as narration-grounded review surface (share manifest v5) (#309) (bb943aa)

Bug Fixes

01cli: route share/star through family manifests, drop dispatch bypass (#286) (f0fe014), closes #242
02dashboard: short-circuit empty-q /api/recall before AppLayer (#284) (683596a), closes #245
03hooks-sdk: align route-dispatch defaults with tuned ROUTING_CLASSES (#304) (c1666bc)
04hooks-sdk: honor worktree guard bypasses (0be68e1)
05ingest: subagent model attribution + spawned edge dispatch metadata (#300) (91e86d8)
06inspect: bound and chunk the direct block/atom ref fan-out (#290) (ecdcbf2), closes #263

Performance

01schema: add skill_paired endpoint indexes (#285) (fc5c5c5), closes #246

generated by release please

Show generated changelog for v0.26.0commit-level detail from Release Please

0.26.0 (2026-06-12)

Features

cli: ax cost models/sessions/split + MCP cost tools (#301) (a77a082)
cli: ax dispatches + --candidates + compile-routing (#303) (8d11f5f)
cli: ax sessions churn - verification churn insight (#308) (f75036d)
cli: manifest owns command visibility - uniform registration loop (#293) (6d1dabc)
dashboard: sessions list remakeover - enriched rows + insight accordion (#299) (c73dd1f)
dashboard: surface skipped-file failures from ingest in the Live tab (#294) (d3818ef), closes #262
hooks-sdk: route-dispatch hook - model-routing suggestions on Agent dispatches (#302) (0750963)
improve: routing proposal rule in the proposals derive stage (#306) (739258c)
ingest: extend per-file failure isolation to Pi, OpenCode, and Cursor (#295) (b1cf97c), closes #261
routing: apply /routing-tune mined classes - task-N-impl, bug-fix, feature-add (#310) (009103d)
skills: efficient-dispatch - measured model-routing orchestration (#307) (d999d2e)
studio: files-touched tree, review view, and session narration (#235) (ec464f3)
studio: Story tab as narration-grounded review surface (share manifest v5) (#309) (bb943aa)

Bug Fixes

cli: route share/star through family manifests, drop dispatch bypass (#286) (f0fe014), closes #242
dashboard: short-circuit empty-q /api/recall before AppLayer (#284) (683596a), closes #245
hooks-sdk: align route-dispatch defaults with tuned ROUTING_CLASSES (#304) (c1666bc)
hooks-sdk: honor worktree guard bypasses (0be68e1)
ingest: subagent model attribution + spawned edge dispatch metadata (#300) (91e86d8)
inspect: bound and chunk the direct block/atom ref fan-out (#290) (ecdcbf2), closes #263

Performance

schema: add skill_paired endpoint indexes (#285) (fc5c5c5), closes #246