story-kit: done 296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation

2026-03-19 09:55:28 +00:00
parent 9cdb0d4ea8
commit f325ddf9fe
1 changed file with 54 additions and 0 deletions
--- a/.story_kit/work/5_done/296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation.md
+++ b/.story_kit/work/5_done/296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation.md
@@ -0,0 +1,54 @@
+---
+name: "Track per-agent token usage for cost visibility and optimisation"
+---
+
+# Story 296: Track per-agent token usage for cost visibility and optimisation
+
+## User Story
+
+As a project owner, I want to see how many tokens each agent consumes per story, so that I can identify expensive operations and optimise token usage across the pipeline.
+
+## Acceptance Criteria
+
+- [ ] Implement per-agent token tracking that captures input tokens, output tokens, and cache tokens (both cache-creation and cache-read) for each agent run
+- [ ] Token usage is recorded per story and per agent (e.g. coder-1 on story 293 used X tokens)
+- [ ] Running totals are visible via an MCP tool, the web UI, or both
+- [ ] Historical token usage is persisted so it survives server restarts (e.g. in story files or a separate log)
+- [ ] Data is structured to support later analysis (e.g. which agent types are most expensive, which stories cost the most)
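A minimal sketch of how the per-story, per-agent bookkeeping described in the criteria above could be shaped. It is not the project's implementation: `TokenCounts`, `UsageLedger`, and all their methods are hypothetical names, story IDs are assumed to be plain integers, and persistence is left out entirely.

```rust
use std::collections::HashMap;

/// Token counts for one agent run, or a sum over several runs (hypothetical type).
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct TokenCounts {
    pub input: u64,
    pub output: u64,
    pub cache_creation: u64,
    pub cache_read: u64,
}

impl TokenCounts {
    /// Sum of all four counters.
    pub fn total(&self) -> u64 {
        self.input + self.output + self.cache_creation + self.cache_read
    }

    /// Adds `other` into `self`, field by field.
    fn add(&mut self, other: TokenCounts) {
        self.input += other.input;
        self.output += other.output;
        self.cache_creation += other.cache_creation;
        self.cache_read += other.cache_read;
    }
}

/// In-memory running totals keyed by (story id, agent name) (hypothetical type).
#[derive(Debug, Default)]
pub struct UsageLedger {
    runs: HashMap<(u32, String), TokenCounts>,
}

impl UsageLedger {
    /// Accumulates the counts from one finished agent session.
    pub fn record(&mut self, story: u32, agent: &str, counts: TokenCounts) {
        self.runs
            .entry((story, agent.to_string()))
            .or_default()
            .add(counts);
    }

    /// Running total for one agent on one story; zeros if nothing was recorded.
    pub fn for_run(&self, story: u32, agent: &str) -> TokenCounts {
        self.runs
            .get(&(story, agent.to_string()))
            .copied()
            .unwrap_or_default()
    }

    /// Total across all agents that worked on `story`.
    pub fn story_total(&self, story: u32) -> TokenCounts {
        let mut sum = TokenCounts::default();
        for ((s, _), counts) in &self.runs {
            if *s == story {
                sum.add(*counts);
            }
        }
        sum
    }

    /// Total across all stories for the agent named `agent`.
    pub fn agent_total(&self, agent: &str) -> TokenCounts {
        let mut sum = TokenCounts::default();
        for ((_, a), counts) in &self.runs {
            if a.as_str() == agent {
                sum.add(*counts);
            }
        }
        sum
    }
}
```

Keying on the (story, agent) pair keeps "coder-1 on story 293" addressable directly, while the two `*_total` helpers cover the later "most expensive story" and "most expensive agent" questions without a second data structure.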
+
+## Research Notes
+
+Claude Code's JSON stream already emits all the data we need. No external library required.
+
+**Data available in the `result` event at end of each agent session:**
+```json
+{
+  "type": "result",
+  "total_cost_usd": 1.57,
+  "usage": {
+    "input_tokens": 7,
+    "output_tokens": 475,
+    "cache_creation_input_tokens": 185020,
+    "cache_read_input_tokens": 810585
+  },
+  "modelUsage": {
+    "claude-opus-4-6[1m]": {
+      "inputTokens": 7,
+      "outputTokens": 475,
+      "cacheReadInputTokens": 810585,
+      "cacheCreationInputTokens": 185020,
+      "costUSD": 1.57
+    }
+  }
+}
+```
+
+**Where to hook in:**
+- `server/src/llm/providers/claude_code.rs` — `process_json_event()` already parses the JSON stream but currently ignores usage data from the `result` event
+- Parse `usage` + `total_cost_usd` from the `result` event and pipe it to the agent completion handler in `server/src/agents/pool.rs`
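The extraction step above can be sketched as follows. This is an illustration under stated assumptions, not the code in `claude_code.rs`: `ResultUsage`, `parse_result_usage`, and `number_after` are hypothetical names, and the naive key scanner only stands in for whatever JSON parser `process_json_event()` already uses, so that the sketch stays std-only. It assumes each key appears once in the event and maps to a bare, non-negative number.

```rust
/// Fields of interest from a `result` event (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct ResultUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
    pub total_cost_usd: f64,
}

/// Parses the number that follows `"key":` in whitespace-free JSON text.
/// Naive stand-in for a real JSON parser; only handles digits and '.'.
fn number_after<T: std::str::FromStr>(compact: &str, key: &str) -> Option<T> {
    let needle = format!("\"{}\":", key);
    let start = compact.find(&needle)? + needle.len();
    let rest = &compact[start..];
    let end = rest
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(rest.len());
    rest[..end].parse().ok()
}

/// Returns usage data if `event` is a `result` event that carries all five
/// fields, otherwise `None`.
pub fn parse_result_usage(event: &str) -> Option<ResultUsage> {
    // Drop whitespace so `"type": "result"` and `"type":"result"` look alike.
    let compact: String = event.chars().filter(|c| !c.is_whitespace()).collect();
    if !compact.contains("\"type\":\"result\"") {
        return None;
    }
    Some(ResultUsage {
        input_tokens: number_after(&compact, "input_tokens")?,
        output_tokens: number_after(&compact, "output_tokens")?,
        cache_creation_input_tokens: number_after(&compact, "cache_creation_input_tokens")?,
        cache_read_input_tokens: number_after(&compact, "cache_read_input_tokens")?,
        total_cost_usd: number_after(&compact, "total_cost_usd")?,
    })
}
```

Because the quoted snake_case keys are matched exactly, the camelCase duplicates under `modelUsage` are ignored. A real implementation would more likely read these fields from the already-parsed JSON value than rescan the raw line.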
+
+**No external libraries needed:** the Anthropic SDK, LiteLLM, Helicone, Langfuse, etc. would all be overkill, since we have direct access to Claude Code's output stream.
+
+## Out of Scope
+
+- TBD