story-kit: done 296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation

Dave
2026-03-19 09:55:28 +00:00
parent 9cdb0d4ea8
commit f325ddf9fe


@@ -0,0 +1,54 @@
---
name: "Track per-agent token usage for cost visibility and optimisation"
---
# Story 296: Track per-agent token usage for cost visibility and optimisation
## User Story
As a project owner, I want to see how many tokens each agent consumes per story, so that I can identify expensive operations and optimise token usage across the pipeline.
## Acceptance Criteria
- [ ] Implement per-agent token tracking that captures input tokens, output tokens, and cache tokens for each agent run
- [ ] Token usage is recorded per story and per agent (e.g. coder-1 on story 293 used X tokens)
- [ ] Running totals are visible — either via MCP tool, web UI, or both
- [ ] Historical token usage is persisted so it survives server restarts (e.g. in story files or a separate log)
- [ ] Data is structured to support later analysis (e.g. which agent types are most expensive, which stories cost the most)
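One way to satisfy the "per story and per agent" and "structured for later analysis" criteria is to key every recorded run by story and agent, then derive totals along either axis. The following is a minimal sketch under stated assumptions, not the implementation: `TokenUsage` and `UsageLedger` are hypothetical names, the ledger is in-memory only (persistence is not shown), and only the four counter names are taken from the `usage` object in the research notes.

```rust
use std::collections::HashMap;

/// Token counters for one or more agent runs. Field names mirror the
/// `usage` object of Claude Code's `result` event.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct TokenUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
}

impl TokenUsage {
    /// Sum of all four counters.
    pub fn total(&self) -> u64 {
        self.input_tokens
            + self.output_tokens
            + self.cache_creation_input_tokens
            + self.cache_read_input_tokens
    }

    /// Add another set of counters into this one, field by field.
    fn add(&mut self, other: &TokenUsage) {
        self.input_tokens += other.input_tokens;
        self.output_tokens += other.output_tokens;
        self.cache_creation_input_tokens += other.cache_creation_input_tokens;
        self.cache_read_input_tokens += other.cache_read_input_tokens;
    }
}

/// Hypothetical in-memory ledger keyed by (story number, agent name).
#[derive(Debug, Default)]
pub struct UsageLedger {
    runs: HashMap<(u32, String), TokenUsage>,
}

impl UsageLedger {
    /// Accumulate one agent run against a story.
    pub fn record(&mut self, story: u32, agent: &str, usage: &TokenUsage) {
        self.runs
            .entry((story, agent.to_string()))
            .or_default()
            .add(usage);
    }

    /// Running total for one agent on one story, if any run was recorded.
    pub fn for_agent_on_story(&self, story: u32, agent: &str) -> Option<&TokenUsage> {
        self.runs.get(&(story, agent.to_string()))
    }

    /// Total across all agents that worked on a story.
    pub fn story_total(&self, story: u32) -> TokenUsage {
        let mut total = TokenUsage::default();
        for ((s, _), usage) in &self.runs {
            if *s == story {
                total.add(usage);
            }
        }
        total
    }

    /// Total across all stories for one agent.
    pub fn agent_total(&self, agent: &str) -> TokenUsage {
        let mut total = TokenUsage::default();
        for ((_, a), usage) in &self.runs {
            if a == agent {
                total.add(usage);
            }
        }
        total
    }
}
```

Keeping the raw counters rather than a single summed figure leaves room to price cache reads, cache writes, and fresh input differently in later analysis.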
## Research Notes
Claude Code's JSON stream already emits all the data we need. No external library required.
**Data available in the `result` event at end of each agent session:**
```json
{
  "type": "result",
  "total_cost_usd": 1.57,
  "usage": {
    "input_tokens": 7,
    "output_tokens": 475,
    "cache_creation_input_tokens": 185020,
    "cache_read_input_tokens": 810585
  },
  "modelUsage": {
    "claude-opus-4-6[1m]": {
      "inputTokens": 7,
      "outputTokens": 475,
      "cacheReadInputTokens": 810585,
      "cacheCreationInputTokens": 185020,
      "costUSD": 1.57
    }
  }
}
```
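To show what the sample event says about where the tokens go, the sketch below works through the arithmetic on its four counters. The constants are copied from the sample above; the function names are hypothetical, and the "USD per million tokens" figure is a blended average over all token kinds, not a published price.

```rust
// Counters copied from the sample `result` event.
const INPUT_TOKENS: u64 = 7;
const OUTPUT_TOKENS: u64 = 475;
const CACHE_CREATION_INPUT_TOKENS: u64 = 185_020;
const CACHE_READ_INPUT_TOKENS: u64 = 810_585;

/// All tokens counted in the session: 7 + 475 + 185020 + 810585 = 996087.
pub fn total_tokens() -> u64 {
    INPUT_TOKENS + OUTPUT_TOKENS + CACHE_CREATION_INPUT_TOKENS + CACHE_READ_INPUT_TOKENS
}

/// Fraction of all tokens that were served from the prompt cache
/// (roughly 0.81 for the sample).
pub fn cache_read_share() -> f64 {
    CACHE_READ_INPUT_TOKENS as f64 / total_tokens() as f64
}

/// Blended cost per million tokens for a session. Returns 0.0 for an
/// empty session to avoid dividing by zero.
pub fn usd_per_million_tokens(total_cost_usd: f64, tokens: u64) -> f64 {
    if tokens == 0 {
        return 0.0;
    }
    total_cost_usd / tokens as f64 * 1_000_000.0
}
```

In this sample, cache traffic dominates: fresh input and output together account for only 482 of 996,087 tokens, which is why tracking the cache counters separately matters for cost analysis.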
**Where to hook in:**
- `server/src/llm/providers/claude_code.rs`: `process_json_event()` already parses the JSON stream but currently ignores usage data from the `result` event
- Parse `usage` + `total_cost_usd` from the `result` event and pipe it to the agent completion handler in `server/src/agents/pool.rs`
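The hand-off described above could be sketched with a standard-library channel: the provider sends one report when it sees the `result` event, and the completion handler drains whatever has arrived. This is only an illustration under assumptions. `UsageReport`, `report_usage`, and `drain_reports` are hypothetical names, and the real wiring between `claude_code.rs` and `pool.rs` may use a different mechanism entirely.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical message carrying the parsed `usage` and `total_cost_usd`
/// fields of one `result` event, tagged with the agent and story.
#[derive(Debug, Clone, PartialEq)]
pub struct UsageReport {
    pub agent: String,
    pub story: u32,
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub total_cost_usd: f64,
}

/// Provider side: called once after the `result` event has been parsed.
/// Returns false if the receiving end has already been dropped.
pub fn report_usage(tx: &Sender<UsageReport>, report: UsageReport) -> bool {
    tx.send(report).is_ok()
}

/// Completion-handler side: collect every report queued so far
/// without blocking.
pub fn drain_reports(rx: &Receiver<UsageReport>) -> Vec<UsageReport> {
    rx.try_iter().collect()
}
```

Sending a single self-describing message per session keeps the provider unaware of how usage is stored or displayed; the handler can persist it, aggregate it, or both.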
**No external libraries needed** — Anthropic SDK, LiteLLM, Helicone, Langfuse etc. are all overkill since we have direct access to Claude Code's output stream.
## Out of Scope
- TBD