The agent pool allowed the same agent (e.g. "qa") to run concurrently
on multiple stories because start_agent() only checked whether that
story+agent combo was already active. It did not check whether the
agent was busy on a different story.
Two concurrent QA runs each spawn cargo clippy + cargo test + vitest,
causing extreme CPU load (load average >33 on an M1 Mac).
Fix: before registering a new agent as Pending, scan all active entries
for any Running or Pending entry with the same agent_name. If one is
found, return an error explaining that the story will be picked up when
the agent becomes available.
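
A minimal sketch of the check described above, not the actual implementation. The AgentPool, AgentEntry, and AgentStatus types, their fields, the set_status helper, and the error wording are all assumptions made for illustration; only start_agent, agent_name, and the Running/Pending states come from this message.

```rust
use std::collections::HashMap;

/// Lifecycle states for a pool entry. `Completed` is an assumed name for
/// "no longer active"; the message only names Running and Pending.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentStatus {
    Pending,
    Running,
    Completed,
}

/// One pool entry (illustrative fields).
#[derive(Debug, Clone)]
pub struct AgentEntry {
    pub agent_name: String,
    pub status: AgentStatus,
}

/// Hypothetical pool keyed by (story_id, agent_name).
#[derive(Default)]
pub struct AgentPool {
    entries: HashMap<(String, String), AgentEntry>,
}

impl AgentPool {
    /// Registers the agent as Pending unless the same agent is already
    /// Running or Pending on any story, including a different one.
    pub fn start_agent(&mut self, story_id: &str, agent_name: &str) -> Result<(), String> {
        let busy_on = self
            .entries
            .iter()
            .find(|(_, e)| {
                e.agent_name == agent_name
                    && matches!(e.status, AgentStatus::Running | AgentStatus::Pending)
            })
            .map(|((story, _), _)| story.clone());

        if let Some(other) = busy_on {
            return Err(format!(
                "agent '{agent_name}' is busy on story '{other}'; \
                 story '{story_id}' will be picked up when the agent becomes available"
            ));
        }

        self.entries.insert(
            (story_id.to_string(), agent_name.to_string()),
            AgentEntry {
                agent_name: agent_name.to_string(),
                status: AgentStatus::Pending,
            },
        );
        Ok(())
    }

    /// Updates the status of an existing entry; a no-op if the entry is missing.
    pub fn set_status(&mut self, story_id: &str, agent_name: &str, status: AgentStatus) {
        let key = (story_id.to_string(), agent_name.to_string());
        if let Some(entry) = self.entries.get_mut(&key) {
            entry.status = status;
        }
    }
}
```

In this sketch a second start_agent call for "qa" on another story fails while the first entry is Pending or Running, and succeeds once that entry is marked Completed.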
The existing auto_assign_available_work() mechanism already scans
pipeline directories (3_qa/, 4_merge/, etc.) for unassigned stories
and uses find_free_agent_for_stage(), which respects single-instance
limits, to assign work when an agent slot opens up. So the queuing
behaviour is naturally provided: the story stays in its directory,
and auto-assign picks it up when the previous run completes.
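
The queuing behaviour above could be sketched roughly as follows. The two function names are the ones mentioned in this message, but the signatures are hypothetical, and the in-memory maps stand in for the real directory scan and agent configuration.

```rust
use std::collections::{BTreeMap, HashSet};

/// Returns the first agent configured for `stage` that is not busy.
/// The stage-to-agents map is an assumed stand-in for the real config.
pub fn find_free_agent_for_stage<'a>(
    stage: &str,
    busy: &HashSet<String>,
    stage_agents: &'a BTreeMap<String, Vec<String>>,
) -> Option<&'a str> {
    stage_agents
        .get(stage)?
        .iter()
        .map(String::as_str)
        .find(|name| !busy.contains(*name))
}

/// Assigns waiting stories to free agents, one story per agent.
/// `waiting` maps a stage directory name (e.g. "3_qa") to unassigned story ids.
/// Stories that find no free agent are left alone and retried on the next pass.
pub fn auto_assign_available_work(
    waiting: &BTreeMap<String, Vec<String>>,
    stage_agents: &BTreeMap<String, Vec<String>>,
    busy: &mut HashSet<String>,
) -> Vec<(String, String)> {
    let mut assigned = Vec::new();
    for (stage, stories) in waiting {
        for story in stories {
            // No free agent for this stage: the remaining stories stay queued.
            let Some(agent) = find_free_agent_for_stage(stage, busy, stage_agents) else {
                break;
            };
            let agent = agent.to_string();
            busy.insert(agent.clone());
            assigned.push((story.clone(), agent));
        }
    }
    assigned
}
```

With a single "qa" agent and two stories waiting in 3_qa/, the first pass assigns only the first story; the second is assigned on a later pass, after "qa" has been removed from the busy set.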
Adds two regression tests:
- start_agent_rejects_when_same_agent_already_running_on_another_story
- start_agent_allows_new_story_when_previous_run_is_completed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sonnet 4.6 is too slow for small stories: agents burn through turns
without completing. Reverting coders, QA, and mergemaster to Sonnet 4.5.
Supervisor and coder-opus remain on Opus.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The --directory flag does not exist in Claude Code CLI. It was added in
c169cfc but caused every agent spawn to exit immediately with "unknown
option", resulting in Session: None errors. The process cwd (set via
cmd.cwd()) already correctly pins agents to the worktree directory.
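
A small sketch of pinning a spawned process to the worktree through its working directory rather than a CLI flag. It uses std::process::Command::current_dir as a stand-in for the cmd.cwd() call mentioned above; build_agent_command and its parameters are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Builds (but does not spawn) the agent command.
/// The worktree is applied as the process working directory only;
/// no directory-related argument is added to the command line.
pub fn build_agent_command(program: &str, args: &[&str], worktree: &Path) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args);
    cmd.current_dir(worktree);
    cmd
}
```

Command::get_current_dir and Command::get_args let a test confirm the working directory and argument list without spawning anything.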
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test_plan field was a gate from the old interactive web UI workflow
where a human would approve a test plan before the LLM could write code.
With autonomous coder agents, this gate is dead weight: coders sometimes
obey the README's "wait for approval" instruction and produce no code.
Removes: TestPlanStatus enum, ensure_test_plan_approved checks in fs/shell,
set_test_plan MCP tool + handler, test_plan from story/bug front matter
creation, test_plan validation in validate_story_dirs, and all related tests.
Updates README to remove Step 2 (Test Planning) and renumber steps.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The worktree doesn't have .story_kit/work/, so agents had no access to
the story requirements. Read the story file from the project root and
prepend it to the prompt. Without this, coders would start, read
CLAUDE.md, have nothing to implement, and exit with no code.
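
Roughly, the change amounts to something like the sketch below. build_prompt, its parameters, and the blank-line separator are assumptions for illustration; the real code may locate the story file and join the text differently.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads the story file relative to the project root (not the worktree)
/// and prepends its contents to the base prompt, separated by a blank line.
/// Returns an I/O error if the story file cannot be read.
pub fn build_prompt(
    project_root: &Path,
    story_rel_path: &Path,
    base_prompt: &str,
) -> io::Result<String> {
    let story = fs::read_to_string(project_root.join(story_rel_path))?;
    Ok(format!("{}\n\n{}", story.trim_end(), base_prompt))
}
```

Returning the error instead of falling back to the bare prompt is a choice made in this sketch: it keeps an agent from starting with nothing to implement.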
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude Code resolves its project root by walking up from cwd looking
for .git. In worktrees, .git is a file pointing back to the main
checkout, so Claude Code would resolve the main repo as its project
and write files there instead of in the worktree. Adding --directory
explicitly pins it to the worktree path.
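
For context, in a linked worktree .git is a text file whose first line has the form "gitdir: <path>". The helper below is a hypothetical sketch that reads that pointer; it is not how Claude Code resolves its project root. Submodules use the same file format, so a Some result only shows that .git is a pointer file.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// If `dir/.git` is a file of the form `gitdir: <path>`, returns that path.
/// Returns None when `.git` is a directory, is missing, or has another format.
pub fn linked_worktree_gitdir(dir: &Path) -> Option<PathBuf> {
    let dot_git = dir.join(".git");
    if !dot_git.is_file() {
        return None;
    }
    let contents = fs::read_to_string(&dot_git).ok()?;
    let target = contents.lines().next()?.strip_prefix("gitdir:")?.trim();
    Some(PathBuf::from(target))
}
```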
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>