The agent pool allowed the same agent (e.g. "qa") to run concurrently
on multiple stories because start_agent() only checked whether that
story+agent combo was already active. It did not check whether the
agent was busy on a different story.
Two concurrent QA runs each spawn cargo clippy + cargo test + vitest,
causing extreme CPU load (load average >33 on M1 Mac).
Fix: before registering a new agent as Pending, scan all active entries
for any Running or Pending entry with the same agent_name. If one is
found, return an error explaining that the story will be picked up when
the agent becomes available.
The existing auto_assign_available_work() mechanism already scans
pipeline directories (3_qa/, 4_merge/, etc.) for unassigned stories
and uses find_free_agent_for_stage() — which respects single-instance
limits — to assign work when an agent slot opens up. So the queuing
behaviour is naturally provided: the story stays in its directory,
and auto-assign picks it up when the previous run completes.
Adds two regression tests:
- start_agent_rejects_when_same_agent_already_running_on_another_story
- start_agent_allows_new_story_when_previous_run_is_completed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>