Commit Graph

38 Commits

Author SHA1 Message Date
Dave
560c731869 story-kit: merge 134_story_add_process_health_monitoring_and_timeout_to_agent_pty_sessions 2026-02-24 13:13:16 +00:00
Dave
c5ddd15273 story-kit: merge 132_story_fix_toctou_race_in_agent_check_and_insert 2026-02-24 12:49:29 +00:00
Dave
b928eace9c story-kit: merge 119_story_mergemaster_should_resolve_merge_conflicts_instead_of_leaving_conflict_markers_on_master 2026-02-23 23:22:24 +00:00
Dave
908fcef353 story-kit: merge 118_bug_agent_pool_retains_stale_running_state_after_completion_blocking_auto_assign 2026-02-23 22:53:23 +00:00
Dave
85fddcb71a story-kit: merge 117_story_show_startup_reconciliation_progress_in_ui 2026-02-23 22:50:57 +00:00
Dave
6d87355577 Merge branch 'feature/story-97_bug_agent_pool_allows_multiple_instances_of_the_same_agent_to_run_concurrently' 2026-02-23 20:53:54 +00:00
Dave
a0f317292c story-kit: merge 93_story_expose_server_logs_to_agents_via_mcp
Adds log_buffer ring buffer and slog! macro for in-memory server log
capture, plus get_server_logs MCP tool for agents to read recent logs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 20:53:37 +00:00
Dave
bed46fea1b story-kit: accept 96_story_reset_agent_lozenge_to_idle_state_when_returning_to_roster 2026-02-23 20:52:06 +00:00
Dave
b09b6ce4f1 fix(agents): enforce single-instance concurrency per agent name
The agent pool allowed the same agent (e.g. "qa") to run concurrently
on multiple stories because start_agent() only checked whether that
story+agent combo was already active. It did not check whether the
agent was busy on a different story.

Two concurrent QA runs each spawn cargo clippy + cargo test + vitest,
causing extreme CPU load (load average >33 on M1 Mac).

Fix: before registering a new agent as Pending, scan all active entries
for any Running or Pending entry with the same agent_name. If one is
found, return an error explaining that the story will be picked up when
the agent becomes available.

The existing auto_assign_available_work() mechanism already scans
pipeline directories (3_qa/, 4_merge/, etc.) for unassigned stories
and uses find_free_agent_for_stage() — which respects single-instance
limits — to assign work when an agent slot opens up. So the queuing
behaviour is naturally provided: the story stays in its directory,
and auto-assign picks it up when the previous run completes.

Adds two regression tests:
- start_agent_rejects_when_same_agent_already_running_on_another_story
- start_agent_allows_new_story_when_previous_run_is_completed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 20:46:51 +00:00
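
Note on b09b6ce4f1: the check described in this commit message can be sketched roughly as below. This is a minimal illustration, not the project's actual code. The `AgentPool`, `AgentEntry`, and `set_status` shapes are assumptions; only the `start_agent` name, the Pending/Running/Completed/Failed statuses, and the "story_id:agent_name" key format come from the commit messages in this log.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentStatus {
    Pending,
    Running,
    Completed,
    Failed,
}

// Hypothetical entry shape; the real struct likely carries more fields.
#[derive(Debug)]
struct AgentEntry {
    agent_name: String,
    status: AgentStatus,
}

#[derive(Default)]
struct AgentPool {
    // Keyed by the composite "story_id:agent_name" string.
    agents: HashMap<String, AgentEntry>,
}

impl AgentPool {
    /// Registers an agent as Pending unless an agent with the same name
    /// is already Pending or Running on any story.
    fn start_agent(&mut self, story_id: &str, agent_name: &str) -> Result<(), String> {
        let busy = self.agents.values().any(|entry| {
            entry.agent_name == agent_name
                && matches!(entry.status, AgentStatus::Pending | AgentStatus::Running)
        });
        if busy {
            return Err(format!(
                "agent '{agent_name}' is busy; story '{story_id}' will be picked up when it becomes available"
            ));
        }
        self.agents.insert(
            format!("{story_id}:{agent_name}"),
            AgentEntry {
                agent_name: agent_name.to_string(),
                status: AgentStatus::Pending,
            },
        );
        Ok(())
    }

    /// Hypothetical helper used here only to simulate a run finishing.
    fn set_status(&mut self, story_id: &str, agent_name: &str, status: AgentStatus) {
        if let Some(entry) = self.agents.get_mut(&format!("{story_id}:{agent_name}")) {
            entry.status = status;
        }
    }
}
```

In this sketch, scanning every entry by `agent_name` rather than looking up the composite key is what closes the gap the commit describes: a lookup by key would only catch the same story+agent pair.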
Dave
8c6bd4cf74 feat(story-93): expose server logs to agents via get_server_logs MCP tool
- Add log_buffer module: bounded 1000-line ring buffer with push/get_recent API
- Add slog! macro: drop-in for eprintln! that also captures to ring buffer
- Replace all eprintln! calls across agents, watcher, search, chat, worktree, claude_code with slog!
- Add get_server_logs MCP tool: accepts count (1-500) and optional filter params
- 5 unit tests for log_buffer covering push/retrieve, eviction, filtering, count limits, empty buffer
- 262 tests passing, clippy clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 20:38:19 +00:00
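
Note on 8c6bd4cf74: a bounded ring buffer with a push/get_recent API, as this commit describes, might look something like the sketch below. It is an illustration under assumptions, not the project's log_buffer module: the `LogBuffer` type, the substring-based filter, and the oldest-first return order are all guesses, and the `slog!` macro and MCP wiring are left out.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded log buffer; the oldest line is evicted when full.
struct LogBuffer {
    capacity: usize,
    lines: VecDeque<String>,
}

impl LogBuffer {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            lines: VecDeque::with_capacity(capacity),
        }
    }

    /// Appends a line, dropping the oldest one if the buffer is at capacity.
    fn push(&mut self, line: impl Into<String>) {
        if self.capacity == 0 {
            return;
        }
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
        }
        self.lines.push_back(line.into());
    }

    /// Returns up to `count` of the most recent lines, oldest first.
    /// If `filter` is given, only lines containing that substring are kept
    /// (substring matching is an assumption of this sketch).
    fn get_recent(&self, count: usize, filter: Option<&str>) -> Vec<String> {
        let mut out: Vec<String> = self
            .lines
            .iter()
            .rev()
            .filter(|line| filter.map_or(true, |f| line.contains(f)))
            .take(count)
            .cloned()
            .collect();
        out.reverse();
        out
    }
}
```

With a capacity of 1000, as in the commit message, a buffer like this would keep memory bounded regardless of how much the server logs, at the cost of losing anything older than the last 1000 lines.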
Dave
cd902ff219 story-kit: merge 94_bug_stale_agent_state_persists_after_server_restart 2026-02-23 20:38:17 +00:00
Dave
31037f5bf5 Remove test_plan gate from the codebase
The test_plan field was a gate from the old interactive web UI workflow
where a human would approve a test plan before the LLM could write code.
With autonomous coder agents, this gate is dead weight — coders sometimes
obey the README's "wait for approval" instruction and produce no code.

Removes: TestPlanStatus enum, ensure_test_plan_approved checks in fs/shell,
set_test_plan MCP tool + handler, test_plan from story/bug front matter
creation, test_plan validation in validate_story_dirs, and all related tests.
Updates README to remove Step 2 (Test Planning) and renumber steps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 19:12:05 +00:00
Dave
1539e52b19 Inject story content into agent prompts so coders know what to build
The worktree doesn't have .story_kit/work/ so agents had no access to
the story requirements. Read the story file from the project root and
prepend it to the prompt. Without this, coders would start, read
CLAUDE.md, have nothing to implement, and exit with no code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:50:41 +00:00
Dave
225073649b story-kit: start 88_story_auto_assign_agents_to_available_work_on_server_startup 2026-02-23 18:20:24 +00:00
Dave
3f008b7777 Fix invalid model names and preserve worktrees for debugging
model = "sonnet-4.6" is not a valid Claude CLI model identifier,
causing all coder/qa/mergemaster agents to get 404 errors from the
API and exit immediately with no work done. Change to
"claude-sonnet-4-6". Also disable automatic worktree cleanup on
archive so agent work can be inspected post-mortem.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 18:05:26 +00:00
Dave
9bd266eb3f Server-owned agent completion: remove report_completion dependency
When an agent process exits normally, the server now automatically runs
acceptance gates (uncommitted changes check + cargo clippy + tests) and
advances the pipeline based on results. This replaces the previous model
where agents had to explicitly call report_completion as an MCP tool.

Changes:
- Add run_server_owned_completion() free function in agents.rs that runs
  gates on process exit, stores a CompletionReport, and advances pipeline
- Wire it into start_agent's spawned task (replaces simple status setting)
- Remove report_completion from MCP tools list and handler (mcp.rs)
- Update default_agent_prompt() to not reference report_completion
- Update all agent prompts in project.toml (supervisor, coders, qa,
  mergemaster) to reflect server-owned completion
- Add guard: skip gates if completion was already recorded (legacy path)
- Add 4 new tests for server-owned completion behavior
- Update tools_list test (26 tools, report_completion excluded)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 15:00:10 +00:00
Dave
16989a12fc story-kit: merge 69_story_test_coverage_qa_gate 2026-02-23 13:40:12 +00:00
Dave
00b212d7e3 Server drives pipeline as state machine
On agent completion, the server automatically runs script/test and
advances stories through the pipeline: coder → qa → mergemaster →
archive. Failed gates restart the agent with failure context. Agents
no longer need to call pipeline-advancing MCP tools.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 13:13:41 +00:00
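
Note on 00b212d7e3: the pipeline progression in this commit message (coder → qa → mergemaster → archive, with failed gates restarting the agent) can be read as a small state machine. The sketch below is one possible encoding and is not taken from the project. The `Stage` and `Transition` types and the `on_agent_completion` function are hypothetical names, and the gate result is modelled as a plain `Result<(), String>` rather than an actual run of script/test.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Coder,
    Qa,
    Mergemaster,
    Archive,
}

#[derive(Debug, PartialEq, Eq)]
enum Transition {
    /// Gates passed: move the story to the next stage.
    Advance(Stage),
    /// Gates failed: restart the same stage's agent with the failure output.
    Retry { stage: Stage, failure_context: String },
    /// Archive is terminal in this sketch.
    Done,
}

/// Decides what happens when the agent for `stage` exits and the gate
/// result is known. `gate` is Ok(()) on success or Err(output) on failure.
fn on_agent_completion(stage: Stage, gate: Result<(), String>) -> Transition {
    match (stage, gate) {
        (Stage::Archive, _) => Transition::Done,
        (s, Err(ctx)) => Transition::Retry {
            stage: s,
            failure_context: ctx,
        },
        (Stage::Coder, Ok(())) => Transition::Advance(Stage::Qa),
        (Stage::Qa, Ok(())) => Transition::Advance(Stage::Mergemaster),
        (Stage::Mergemaster, Ok(())) => Transition::Advance(Stage::Archive),
    }
}
```

Keeping the decision in a pure function like this means the server, not the agent, owns every transition, which matches the commit's point that agents no longer call pipeline-advancing MCP tools.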
Dave
cbd0233e5e story-kit: start 65_story_standardised_script_test_entry_point_for_all_projects 2026-02-23 12:59:55 +00:00
Dave
810608d3d8 Spike 61: filesystem watcher and UI simplification
Add notify-based filesystem watcher for .story_kit/work/ that
auto-commits changes with deterministic messages and broadcasts
events over WebSocket. Push full pipeline state (Upcoming, Current,
QA, To Merge) to frontend on connect and after every watcher event.

Strip dead UI: remove ReviewPanel, GatePanel, TodoPanel,
UpcomingPanel and all associated REST polling. Replace with 4
generic StagePanel components driven by WebSocket. Simplify
AgentPanel to roster-only.

Delete all 11 workflow HTTP endpoints and 16 request/response types
from the server. Clean dead code from workflow module. MCP tools
call Rust functions directly and need none of the HTTP layer.

Net: ~4,100 lines deleted, ~400 added.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 19:39:19 +00:00
Dave
122f481ab9 Story 53: Add QA agent role with request_qa MCP tool
- Add `qa` agent entry to `.story_kit/project.toml` with a detailed
  prompt covering code quality scan, test verification, manual testing
  support, and structured report generation
- Add `move_story_to_qa` function in `agents.rs` that moves a work item
  from `work/2_current/` to `work/3_qa/` and auto-commits (idempotent)
- Add `request_qa` MCP tool in `mcp.rs` that moves the story to
  `work/3_qa/` and starts the QA agent on the existing worktree
- Add unit tests for `move_story_to_qa` (moves, idempotent, error cases)
- Update `tools_list_returns_all_tools` test to expect 27 tools

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 17:45:43 +00:00
Dave
9dab18d597 Story 52: Mergemaster agent role with merge_agent_work MCP tool
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:36:35 +00:00
Dave
e15fbffbb8 Fix 25 tests for work/ directory restructure (story 60)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:24:26 +00:00
Dave
e1e0d49759 Story 60: Status-Based Directory Layout with work/ pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:17:12 +00:00
Dave
2d28304a41 Story 49: Deterministic Bug Lifecycle Management
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:41:04 +00:00
Dave
7f672cae5f Story 50: Unified Current Work Directory
- Move current/ to .story_kit/current/ (out of stories/)
- Type-aware routing for bugs, spikes, stories
- close_bug_to_archive() for bug lifecycle
- All path references updated across agents.rs, workflow.rs, mcp.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 16:25:08 +00:00
Dave
928cc64bfa Story 46: Deterministic Story Mutations with Auto-Commit
- Add git_stage_and_commit() helper for deterministic commits
- move_story_to_current() auto-commits on start_agent
- accept_story auto-commits move to archived/
- New MCP tools: check_criterion, set_test_plan (total: 21)
- create_story MCP always auto-commits
- Tests for check_criterion and set_test_plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 15:34:41 +00:00
Dave
5c164f4855 Accept story 45: Deterministic Story Lifecycle Management
- accept_story MCP tool moves current/ to archived/
- move_story_to_archived helper with idempotent behavior
- start_agent auto-moves upcoming/ to current/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 15:09:39 +00:00
Dave
1b71449dd0 Story 44: Agent Completion Report via MCP
- report_completion MCP tool for agents to signal done
- Rejects if worktree has uncommitted changes
- Runs acceptance gates (clippy, tests) automatically
- Stores completion status on agent record
- 10 new tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 15:02:34 +00:00
Dave
a9d45bbcd5 Story 42: Deterministic worktree management via REST/MCP API
Add REST and MCP endpoints for creating, listing, and removing worktrees.
Includes worktree lifecycle management and cleanup operations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 14:22:09 +00:00
Dave
a3c20eb4d4 Accept story 40: MCP Server Obeys STORYKIT_PORT
Agent worktrees now get a .mcp.json written with the correct port from
the running server. AgentPool receives the port at construction and
passes it through to create_worktree, which writes .mcp.json on both
new creation and reuse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 13:24:35 +00:00
Dave
c6a04f5e53 Accept story 41: Agent Completion Notification via MCP
Add wait_for_agent MCP tool that blocks until an agent reaches a terminal
state (completed, failed, stopped). Returns final status with session_id,
worktree_path, and git commits made by the agent.

- Subscribe-before-check pattern avoids race conditions
- Handles lagged receivers, channel closure, and configurable timeout
- Default timeout 5 minutes, includes git log of agent commits in response
- 11 new tests covering all paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 13:16:04 +00:00
Dave
39b67ff754 Story 33: Copy-paste diff commands for agent worktrees
- Add base_branch detection to WorktreeInfo (from project root HEAD)
- Expose base_branch in AgentInfo API response
- Add {{base_branch}} template variable to agent config rendering
- Show git difftool command with copy-to-clipboard in AgentPanel UI
- Add diff command instruction to coder agent prompts
- Add AgentPanel tests for diff command rendering and clipboard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 12:48:50 +00:00
Dave
db2d055f60 Spike 3: Sub-agent infrastructure fixes for multi-agent coordination
- Fix CLAUDECODE env var blocking nested Claude Code sessions
- Add drain-based event_log for reliable get_agent_output polling
- Add non-SSE get_agent_output fallback (critical for MCP tool calls)
- Preserve worktrees on agent stop instead of destroying work
- Reap zombie processes with child.wait() after kill
- Increase broadcast buffer from 256 to 1024
- Engineer supervisor and coder prompts in project.toml
- Point .mcp.json to test port 3002

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 12:14:36 +00:00
Dave
6d57b06636 Accept story 34: Per-Project Agent Configuration and Role Definitions
Replace single [agent] config with multi-agent [[agent]] roster system.
Each agent has name, role, model, allowed_tools, max_turns, max_budget_usd,
and system_prompt fields that map to Claude CLI flags at spawn time.

- AgentConfig expanded with structured fields, validated at startup (panics
  on duplicate names, empty names, non-positive budgets/turns)
- Backwards-compatible: legacy [agent] format auto-wraps with deprecation warning
- AgentPool uses composite "story_id:agent_name" keys for concurrent agents
- agent_name added to AgentEvent variants, AgentInfo, start/stop/subscribe APIs
- GET /agents/config returns roster, POST /agents/config/reload hot-reloads
- POST /agents/start accepts optional agent_name, /agents/stop requires it
- SSE route updated to /agents/:story_id/:agent_name/stream
- Frontend: roster badges, agent selector dropdown, composite-key state
- Project root initialized to cwd at startup so config endpoints work immediately

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 18:46:14 +00:00
Dave
5e5cdd9b2f Accept story 30: Worktree-based agent orchestration
Add git worktree isolation for concurrent story agents. Each agent now
runs in its own worktree with setup/teardown commands driven by
.story_kit/project.toml config. Agents stream output via SSE and support
start/stop lifecycle with Pending/Running/Completed/Failed statuses.

Backend: config.rs (TOML parsing), worktree.rs (git worktree lifecycle),
refactored agents.rs (broadcast streaming), agents_sse.rs (SSE endpoint).
Frontend: AgentPanel.tsx with Run/Stop buttons and streaming output log.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 17:58:53 +00:00
Dave
bf0fb5bcf6 Add story 35: Agent security and sandboxing, add bypassPermissions to agent spawns
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 15:56:05 +00:00
Dave
68a19c393e Spike: PTY-based Claude Code integration with multi-agent concurrency
Proves that spawning `claude -p` in a pseudo-terminal from Rust gets Max
subscription billing (apiKeySource: "none", rateLimitType: "five_hour")
instead of per-token API charges. Concurrent agents run in parallel PTY
sessions with session resumption via --resume for multi-turn conversations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 15:25:22 +00:00