Observed: stories 917, 918, 920, 910 all turn-limit-killed despite producing
real commits. Tally across their session logs shows 30–55% of assistant
turns were pure narration ("I'll read X next", "Now let me check Y") with
no tool_use. At 80 max_turns the effective work budget was ~44 tool calls,
not enough for a typical bug fix's edit + test + check_criterion cycle.
Changes:
- New optional AgentConfig field max_tool_turns. When set the watchdog
uses it instead of max_turns; only assistant messages whose
data.message.content has at least one tool_use block count.
- count_turns_in_log in agents/pool/auto_assign/watchdog/limits.rs
filters on tool_use. Existing test helper write_fake_session_log now
emits tool_use blocks; added write_fake_mixed_session_log for the
narration regression test.
- agents.toml: coders/coder-opus get max_turns=200 (claude-code's own
--max-turns cap, sized to never bite before the watchdog) and
max_tool_turns=80. qa: 120 / 40. mergemaster: 250 / 100. Budgets
unchanged — the dollar cap remains the runaway-loop backstop, with
~$3-5 worst-case waste if an agent narrates indefinitely.
- Two new regression tests:
* watchdog_does_not_count_narration_only_turns: 5 tool + 30 narration
under max_tool_turns=10 stays Running.
* watchdog_max_tool_turns_overrides_max_turns: 4 tool turns at
max_tool_turns=3 / max_turns=200 still terminates with TurnLimit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add `disallowed_tools` field to `AgentConfig` and render it as
`--disallowedTools` CLI flag in `render_agent_args`
- Set `disallowed_tools = ["ScheduleWakeup"]` on all four coder agents
(coder-1, coder-2, coder-3, coder-opus); QA and mergemaster unaffected
- Append instruction to all coder `system_prompt`s: do not use
ScheduleWakeup to wait for run_tests; if run_tests appears to time out,
call run_tests again — it attaches to the in-flight job and blocks
- Add tests: `render_agent_args_disallowed_tools` and
`coder_agents_disallow_schedule_wakeup`