fix(923): watchdog counts only tool-using turns; narration-only turns no longer burn budget

Observed: stories 917, 918, 920, 910 all turn-limit-killed despite producing real commits. Tally across their session logs shows 30–55% of assistant turns were pure narration ("I'll read X next", "Now let me check Y") with no tool_use. At 80 max_turns the effective work budget was ~44 tool calls, not enough for a typical bug fix's edit + test + check_criterion cycle. Changes: - New optional AgentConfig field max_tool_turns. When set the watchdog uses it instead of max_turns; only assistant messages whose data.message.content has at least one tool_use block count. - count_turns_in_log in agents/pool/auto_assign/watchdog/limits.rs filters on tool_use. Existing test helper write_fake_session_log now emits tool_use blocks; added write_fake_mixed_session_log for the narration regression test. - agents.toml: coders/coder-opus get max_turns=200 (claude-code's own --max-turns cap, sized to never bite before the watchdog) and max_tool_turns=80. qa: 120 / 40. mergemaster: 250 / 100. Budgets unchanged — the dollar cap remains the runaway-loop backstop, with ~$3-5 worst-case waste if an agent narrates indefinitely. - Two new regression tests: * watchdog_does_not_count_narration_only_turns: 5 tool + 30 narration under max_tool_turns=10 stays Running. * watchdog_max_tool_turns_overrides_max_turns: 4 tool turns at max_tool_turns=3 / max_turns=200 still terminates with TurnLimit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:25:11 +01:00
parent ce07c4d7b7
commit 6feb68f3e3
5 changed files with 197 additions and 22 deletions
@@ -3,7 +3,8 @@ name = "coder-1"
 stage = "coder"
 role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
-max_turns = 80
+max_turns = 200
+max_tool_turns = 80
 max_budget_usd = 5.00
 disallowed_tools = ["ScheduleWakeup"]
 prompt ="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .huskies/README.md for the dev process, .huskies/specs/00_CONTEXT.md for what this project does, and .huskies/specs/tech/STACK.md for the tech stack and source map. The story details are in your prompt above. The worktree and feature branch already exist - do not create them.\n\n## Your workflow\n1. Read the story and understand the acceptance criteria.\n2. Implement the changes.\n3. As you complete each acceptance criterion, call check_criterion MCP tool to mark it done.\n4. Run the run_tests MCP tool. It blocks server-side until tests finish (up to 20 minutes) and returns the full result. Do NOT call get_test_result — run_tests already gives you the pass/fail outcome.\n5. If tests fail, fix the failures and run run_tests again. Do not commit until tests pass.\n6. Once tests pass, commit your work with a descriptive message and exit.\n\nDo NOT accept stories, move them between stages, or merge to master. The server handles all of that after you exit.\n\n## Bug Workflow: Trust the Story, Act Fast\nWhen working on bugs:\n1. READ THE STORY DESCRIPTION FIRST. If it specifies exact files, functions, and line numbers — go directly there and make the fix.\n2. If the story does NOT specify the exact location, investigate with targeted grep.\n3. Fix with a surgical, minimal change.\n4. Run tests, fix failures, commit and exit.\n5. Write commit messages that explain what broke and why."
@@ -14,7 +15,8 @@ name = "coder-2"
 stage = "coder"
 role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
-max_turns = 80
+max_turns = 200
+max_tool_turns = 80
 max_budget_usd = 5.00
 disallowed_tools = ["ScheduleWakeup"]
 prompt ="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .huskies/README.md for the dev process, .huskies/specs/00_CONTEXT.md for what this project does, and .huskies/specs/tech/STACK.md for the tech stack and source map. The story details are in your prompt above. The worktree and feature branch already exist - do not create them.\n\n## Your workflow\n1. Read the story and understand the acceptance criteria.\n2. Implement the changes.\n3. As you complete each acceptance criterion, call check_criterion MCP tool to mark it done.\n4. Run the run_tests MCP tool. It blocks server-side until tests finish (up to 20 minutes) and returns the full result. Do NOT call get_test_result — run_tests already gives you the pass/fail outcome.\n5. If tests fail, fix the failures and run run_tests again. Do not commit until tests pass.\n6. Once tests pass, commit your work with a descriptive message and exit.\n\nDo NOT accept stories, move them between stages, or merge to master. The server handles all of that after you exit.\n\n## Bug Workflow: Trust the Story, Act Fast\nWhen working on bugs:\n1. READ THE STORY DESCRIPTION FIRST. If it specifies exact files, functions, and line numbers — go directly there and make the fix.\n2. If the story does NOT specify the exact location, investigate with targeted grep.\n3. Fix with a surgical, minimal change.\n4. Run tests, fix failures, commit and exit.\n5. Write commit messages that explain what broke and why."
@@ -25,7 +27,8 @@ name = "coder-3"
 stage = "coder"
 role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
-max_turns = 80
+max_turns = 200
+max_tool_turns = 80
 max_budget_usd = 5.00
 disallowed_tools = ["ScheduleWakeup"]
 prompt ="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .huskies/README.md for the dev process, .huskies/specs/00_CONTEXT.md for what this project does, and .huskies/specs/tech/STACK.md for the tech stack and source map. The story details are in your prompt above. The worktree and feature branch already exist - do not create them.\n\n## Your workflow\n1. Read the story and understand the acceptance criteria.\n2. Implement the changes.\n3. As you complete each acceptance criterion, call check_criterion MCP tool to mark it done.\n4. Run the run_tests MCP tool. It blocks server-side until tests finish (up to 20 minutes) and returns the full result. Do NOT call get_test_result — run_tests already gives you the pass/fail outcome.\n5. If tests fail, fix the failures and run run_tests again. Do not commit until tests pass.\n6. Once tests pass, commit your work with a descriptive message and exit.\n\nDo NOT accept stories, move them between stages, or merge to master. The server handles all of that after you exit.\n\n## Bug Workflow: Trust the Story, Act Fast\nWhen working on bugs:\n1. READ THE STORY DESCRIPTION FIRST. If it specifies exact files, functions, and line numbers — go directly there and make the fix.\n2. If the story does NOT specify the exact location, investigate with targeted grep.\n3. Fix with a surgical, minimal change.\n4. Run tests, fix failures, commit and exit.\n5. Write commit messages that explain what broke and why."
@@ -36,7 +39,8 @@ name = "qa-2"
 stage = "qa"
 role = "Reviews coder work in worktrees: runs quality gates, verifies acceptance criteria, and reports findings."
 model = "sonnet"
-max_turns = 40
+max_turns = 120
+max_tool_turns = 40
 max_budget_usd = 4.00
 prompt = """You are the QA agent for story {{story_id}}. Your job is to verify the coder's work satisfies the story's acceptance criteria and produce a structured QA report.

@@ -127,7 +131,8 @@ name = "coder-opus"
 stage = "coder"
 role = "Senior full-stack engineer for complex tasks. Implements features across all components."
 model = "opus"
-max_turns = 80
+max_turns = 200
+max_tool_turns = 80
 max_budget_usd = 20.00
 disallowed_tools = ["ScheduleWakeup"]
 prompt ="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .huskies/README.md for the dev process, .huskies/specs/00_CONTEXT.md for what this project does, and .huskies/specs/tech/STACK.md for the tech stack and source map. The story details are in your prompt above. The worktree and feature branch already exist - do not create them.\n\n## Your workflow\n1. Read the story and understand the acceptance criteria.\n2. Implement the changes.\n3. As you complete each acceptance criterion, call check_criterion MCP tool to mark it done.\n4. Run the run_tests MCP tool. It blocks server-side until tests finish (up to 20 minutes) and returns the full result. Do NOT call get_test_result — run_tests already gives you the pass/fail outcome.\n5. If tests fail, fix the failures and run run_tests again. Do not commit until tests pass.\n6. Once tests pass, commit your work with a descriptive message and exit.\n\nDo NOT accept stories, move them between stages, or merge to master. The server handles all of that after you exit.\n\n## Bug Workflow: Trust the Story, Act Fast\nWhen working on bugs:\n1. READ THE STORY DESCRIPTION FIRST. If it specifies exact files, functions, and line numbers — go directly there and make the fix.\n2. If the story does NOT specify the exact location, investigate with targeted grep.\n3. Fix with a surgical, minimal change.\n4. Run tests, fix failures, commit and exit.\n5. Write commit messages that explain what broke and why."
@@ -138,7 +143,8 @@ name = "qa"
 stage = "qa"
 role = "Reviews coder work in worktrees: runs quality gates, verifies acceptance criteria, and reports findings."
 model = "sonnet"
-max_turns = 40
+max_turns = 120
+max_tool_turns = 40
 max_budget_usd = 4.00
 prompt = """You are the QA agent for story {{story_id}}. Your job is to verify the coder's work satisfies the story's acceptance criteria and produce a structured QA report.

@@ -229,7 +235,8 @@ name = "mergemaster"
 stage = "mergemaster"
 role = "Merges completed coder work into master, runs quality gates, archives stories, and cleans up worktrees."
 model = "opus"
-max_turns = 100
+max_turns = 250
+max_tool_turns = 100
 max_budget_usd = 25.00
 inactivity_timeout_secs = 900
 prompt = """You are the mergemaster agent for story {{story_id}}. Your job is to merge the completed coder work into master.