From 170fd538089eb18b6f0c04b2264649f925c9ad53 Mon Sep 17 00:00:00 2001
From: Dave <futurechimp@users.noreply.github.com>
Date: Thu, 19 Mar 2026 09:37:21 +0000
Subject: [PATCH] story-kit: create
 296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation

---
 ...ge_for_cost_visibility_and_optimisation.md | 33 +++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/.story_kit/work/1_backlog/296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation.md b/.story_kit/work/1_backlog/296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation.md
index 007a557..6d2c702 100644
--- a/.story_kit/work/1_backlog/296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation.md
+++ b/.story_kit/work/1_backlog/296_story_track_per_agent_token_usage_for_cost_visibility_and_optimisation.md
@@ -16,6 +16,39 @@ As a project owner, I want to see how many tokens each agent consumes per story,
 - [ ] Historical token usage is persisted so it survives server restarts (e.g. in story files or a separate log)
 - [ ] Data is structured to support later analysis (e.g. which agent types are most expensive, which stories cost the most)
 
+## Research Notes
+
+Claude Code's JSON stream already emits all the data we need. No external library required.
+
+**Data available in the `result` event at the end of each agent session:**
+```json
+{
+  "type": "result",
+  "total_cost_usd": 1.57,
+  "usage": {
+    "input_tokens": 7,
+    "output_tokens": 475,
+    "cache_creation_input_tokens": 185020,
+    "cache_read_input_tokens": 810585
+  },
+  "modelUsage": {
+    "claude-opus-4-6[1m]": {
+      "inputTokens": 7,
+      "outputTokens": 475,
+      "cacheReadInputTokens": 810585,
+      "cacheCreationInputTokens": 185020,
+      "costUSD": 1.57
+    }
+  }
+}
+```
+
+**Where to hook in:**
+- `server/src/llm/providers/claude_code.rs` — `process_json_event()` already parses the JSON stream but currently ignores usage data from the `result` event
+- Parse `usage` and `total_cost_usd` from the `result` event and pass them to the agent completion handler in `server/src/agents/pool.rs`
+
+**No external libraries needed.** The Anthropic SDK, LiteLLM, Helicone, Langfuse, etc. are all overkill because we have direct access to Claude Code's output stream.
+
 ## Out of Scope
 
 - TBD