Story 60: Status-Based Directory Layout with work/ pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 17:16:48 +00:00
parent 5fc085fd9e
commit e1e0d49759
74 changed files with 102 additions and 418 deletions
--- a/.story_kit/work/1_upcoming/29_story_directory_based_workflow_coordination.md
+++ b/.story_kit/work/1_upcoming/29_story_directory_based_workflow_coordination.md
@@ -0,0 +1,19 @@
+---
+name: Directory-Based Workflow Coordination and Locks
+test_plan: pending
+---
+# Story 29: Directory-Based Workflow Coordination and Locks
+
+## User Story
+As a user, I want directory-based story workflow coordination with lock tracking, so multiple agents can pick up work with minimal context while keeping coordination in `master`.
+
+## Acceptance Criteria
+- Add a `stories/check/` directory for review/verification handoff.
+- Define a lock file format in `master` (e.g., `.story_kit/locks.json`) that tracks story assignment, agent identity, worktree path, and last update time.
+- Document the story lifecycle across `upcoming/`, `current/`, `check/`, and `archived/` directories.
+- Document that code changes happen in worktrees, while coordination files and story movement live in `master`.
+
+## Out of Scope
+- Implementing the lock mechanism or agents in code.
+- Enforcing locks at runtime.
+- Multi-agent orchestration beyond documenting the workflow.
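
The lock file format described in the acceptance criteria could look roughly like the sketch below. It is illustrative only (Python, not the project's own code): the key names `story`, `agent`, `worktree`, and `updated_at` and both helper functions are assumptions, since the story names the four tracked fields but does not define a schema.

```python
import json
from datetime import datetime, timezone


def make_lock_entry(story_id, agent, worktree, now=None):
    """Build one lock record holding the four fields the story lists."""
    now = now or datetime.now(timezone.utc)
    return {
        "story": story_id,               # hypothetical key name
        "agent": agent,                  # hypothetical key name
        "worktree": worktree,            # hypothetical key name
        "updated_at": now.isoformat(),   # hypothetical key name
    }


def render_locks(entries):
    """Serialize lock entries as stable, diff-friendly JSON keyed by story."""
    by_story = {entry["story"]: entry for entry in entries}
    return json.dumps(by_story, indent=2, sort_keys=True)
```

Sorting the keys keeps the file stable between writes, which helps when the lock file is committed to `master` and reviewed as a diff.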
--- a/.story_kit/work/1_upcoming/35_story_agent_security_and_sandboxing.md
+++ b/.story_kit/work/1_upcoming/35_story_agent_security_and_sandboxing.md
@@ -0,0 +1,32 @@
+---
+name: Agent Security and Sandboxing
+test_plan: pending
+---
+# Story 35: Agent Security and Sandboxing
+
+## User Story
+**As a** supervisor orchestrating multiple autonomous agents,
+**I want to** constrain what each agent can access and do,
+**So that** agents can't escape their worktree, damage shared state, or perform unintended actions.
+
+## Acceptance Criteria
+- [ ] Agent creation accepts an `allowed_tools` list to restrict Claude Code tool access per agent.
+- [ ] Agent creation accepts a `disallowed_tools` list as an alternative to allowlisting.
+- [ ] Agents without Bash access can still perform useful coding work (Read, Edit, Write, Glob, Grep).
+- [ ] Investigate replacing direct Bash/shell access with Rust-implemented tool proxies that enforce boundaries:
+  - Scoped `exec_shell` that only runs allowlisted commands (e.g., `cargo test`, `npm test`) within the agent's worktree.
+  - Scoped `read_file` / `write_file` that reject paths outside the agent's worktree root.
+  - Scoped `git` operations that only work within the agent's worktree.
+- [ ] Evaluate `--max-turns` and `--max-budget-usd` as safety limits for runaway agents.
+- [ ] Document the trust model: what the supervisor controls vs what agents can do autonomously.
+
+## Questions to Explore
+- Can we use MCP (Model Context Protocol) to expose our Rust-implemented tools to Claude Code, replacing its built-in Bash/filesystem tools with scoped versions?
+- What's the right granularity for shell allowlists — command-level (`cargo test`) or pattern-level (`cargo *`)?
+- Should agents have read access outside their worktree (e.g., to reference shared specs) but write access only within it?
+- Is OS-level sandboxing (Docker, macOS sandbox profiles) worth the complexity for a personal tool?
+
+## Out of Scope
+- Multi-user authentication or authorization (single-user personal tool).
+- Network-level isolation between agents.
+- Encrypting agent communication channels (all local).
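
The scoped `read_file` / `write_file` and `exec_shell` ideas above can be sketched as follows. This is a hedged illustration in Python, not the Rust tool proxies the story proposes: `resolve_in_worktree`, `is_allowed_command`, and `ALLOWED_COMMANDS` are hypothetical names, and command-level matching on the first two arguments is just one possible answer to the open granularity question.

```python
import os


def resolve_in_worktree(worktree_root, requested):
    """Return the absolute path for `requested`, or raise if it escapes the root.

    Symlinks and `..` segments are resolved before the containment check.
    """
    root = os.path.realpath(worktree_root)
    target = os.path.realpath(os.path.join(root, requested))
    if os.path.commonpath([root, target]) != root:
        raise PermissionError(f"path escapes worktree: {requested}")
    return target


# Hypothetical allowlist, using the two example commands from the story.
ALLOWED_COMMANDS = {("cargo", "test"), ("npm", "test")}


def is_allowed_command(argv):
    """Command-level check: the first two arguments must match an allowlisted pair."""
    return tuple(argv[:2]) in ALLOWED_COMMANDS
```

Note that this check lets extra arguments through (`cargo test --release` passes), so a stricter proxy would also need to validate the remaining arguments.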
--- a/.story_kit/work/1_upcoming/4_bug_run_button_does_not_start_agent.md
+++ b/.story_kit/work/1_upcoming/4_bug_run_button_does_not_start_agent.md
@@ -0,0 +1,24 @@
+---
+name: Run button does not start agent
+---
+
+# Bug 4: Run Button Does Not Start Agent
+
+## Symptom
+
+Clicking the "Run" button in the AgentPanel does not visibly start an agent. No feedback is shown to the user.
+
+## Root Cause
+
+When multiple agents are configured in `project.toml` (e.g. supervisor, coder-1, coder-2), `handleRunClick` shows a role-selector dropdown instead of starting an agent directly. The dropdown may not be visible due to layout/positioning issues, or the click handler may be swallowed.
+
+## Reproduction Steps
+
+1. Start the server and open the web UI
+2. Expand a story in the Agent panel
+3. Click the "Run" button
+4. Observe: nothing visible happens (no agent starts, no dropdown appears)
+
+## Proposed Fix
+
+Investigate whether the role-selector dropdown is rendering but hidden (z-index, overflow, positioning), or whether the click event is not reaching `handleRunClick`. If the dropdown is the issue, consider starting the default agent directly and offering role selection separately.
--- a/.story_kit/work/1_upcoming/51_story_deterministic_spike_lifecycle_management.md
+++ b/.story_kit/work/1_upcoming/51_story_deterministic_spike_lifecycle_management.md
@@ -0,0 +1,29 @@
+---
+name: Deterministic Spike Lifecycle Management
+test_plan: pending
+---
+
+# Story 51: Deterministic Spike Lifecycle Management
+
+## User Story
+
+As a developer running autonomous agents, I want all spike file mutations to happen through server MCP/REST tools that auto-commit to master, so that spikes are tracked consistently alongside stories and bugs.
+
+## Prerequisites
+
+- Story 49 (Deterministic Bug Lifecycle Management)
+- Story 50 (Unified Current Work Directory)
+
+## Acceptance Criteria
+
+- [ ] New MCP tool `create_spike(name, description, goals)` creates a spike file in `.story_kit/spikes/` with a deterministic filename and auto-commits to master
+- [ ] New MCP tool `list_spikes()` returns all open spikes (files in `.story_kit/spikes/` excluding `archive/`)
+- [ ] New MCP tool `archive_spike(spike_id)` moves a spike from `.story_kit/spikes/` to `.story_kit/spikes/archive/` and auto-commits to master
+- [ ] `start_agent` moves spike files into `.story_kit/current/` and auto-commits
+- [ ] All auto-commits use deterministic commit messages (e.g. "story-kit: create spike spike-3-explore-foo", "story-kit: archive spike spike-3")
+- [ ] Agents never need to edit spike markdown files directly — all mutations go through server tools
+
+## Out of Scope
+
+- Spike-to-story conversion tooling
+- Time-boxing or expiry for spikes
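
A minimal sketch of how deterministic spike identifiers and commit messages might be derived, reproducing the two example messages from the acceptance criteria. The function names and the slug rule (lowercase, non-alphanumerics collapsed to `-`) are assumptions for illustration; the story does not specify them.

```python
import re


def slugify(name):
    """Lowercase the name and collapse runs of non-alphanumerics into '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def spike_id(number, name):
    """Build an identifier such as 'spike-3-explore-foo'."""
    return f"spike-{number}-{slugify(name)}"


def commit_message(action, item_id):
    """Build a message such as 'story-kit: create spike spike-3-explore-foo'."""
    return f"story-kit: {action} spike {item_id}"
```

Because both functions are pure, the same inputs always yield the same filename and commit message, which is what makes the auto-commits reproducible.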
--- a/.story_kit/work/1_upcoming/52_story_mergemaster_agent_role.md
+++ b/.story_kit/work/1_upcoming/52_story_mergemaster_agent_role.md
@@ -0,0 +1,26 @@
+---
+name: Mergemaster Agent Role
+test_plan: pending
+---
+
+# Story 52: Mergemaster Agent Role
+
+## User Story
+
+As a developer, I want a dedicated mergemaster agent that handles the full accept→merge→archive→cleanup pipeline, so that merging coder work to master is deterministic and doesn't require manual conflict resolution.
+
+## Acceptance Criteria
+
+- [ ] New `mergemaster` agent role in `.story_kit/project.toml`
+- [ ] Mergemaster can cherry-pick or rebase a worktree branch onto master
+- [ ] Mergemaster resolves merge conflicts (or reports them clearly if it can't)
+- [ ] Mergemaster runs all quality gates after merge (cargo test, cargo clippy, pnpm test, pnpm build)
+- [ ] Mergemaster moves the story/bug from `work/4_merge/` to `work/5_archived/` and auto-commits
+- [ ] Mergemaster cleans up the worktree and branch after successful merge
+- [ ] MCP tool `merge_agent_work(agent_name, story_id)` triggers the mergemaster pipeline
+- [ ] Mergemaster reports success/failure with details (conflicts found, tests passed/failed)
+
+## Out of Scope
+
+- Automated conflict resolution using AI (can follow later — start with simple cherry-pick/rebase)
+- Running mergemaster as a persistent daemon
--- a/.story_kit/work/1_upcoming/53_story_qa_agent_role.md
+++ b/.story_kit/work/1_upcoming/53_story_qa_agent_role.md
@@ -0,0 +1,42 @@
+---
+name: QA Agent Role
+test_plan: pending
+---
+
+# Story 53: QA Agent Role
+
+## User Story
+
+As a developer, I want a dedicated QA agent that reviews coder work in worktrees before merge, so that obvious bugs, quality issues, and missing test coverage are caught before code reaches master.
+
+## Acceptance Criteria
+
+### Code Quality Scan
+- [ ] QA agent scans the worktree diff for obvious coding mistakes (unused imports, dead code, unhandled errors, hardcoded values)
+- [ ] QA agent runs `cargo clippy --all-targets --all-features` and reports any warnings
+- [ ] QA agent runs `pnpm run build` (tsc + vite) and reports any TypeScript errors
+- [ ] QA agent runs `biome check` and reports any linting issues
+
+### Test Verification
+- [ ] QA agent runs `cargo test` and verifies all tests pass
+- [ ] QA agent runs `pnpm run test` and verifies all frontend tests pass
+- [ ] QA agent runs coverage collection and reports coverage percentage
+- [ ] QA agent reviews test quality — flags tests that are trivial or don't assert meaningful behavior
+
+### Manual Testing Support
+- [ ] QA agent builds the server and frontend in the worktree
+- [ ] QA agent starts a test server on a free port
+- [ ] QA agent generates a testing plan: URL to visit, things to check in the UI, curl commands to exercise endpoints
+- [ ] QA agent presents the testing plan to the human via `report_completion` or a new MCP tool
+- [ ] Human can approve or reject with feedback
+
+### Agent Configuration
+- [ ] New `qa` agent role in `.story_kit/project.toml`
+- [ ] MCP tool `request_qa(agent_name, story_id)` triggers QA review of a worktree and moves the item from `work/2_current/` to `work/3_qa/`
+- [ ] QA agent produces a structured report (pass/fail per category, details, testing plan)
+
+## Out of Scope
+
+- Automated UI testing (Playwright, Cypress)
+- Performance/load testing
+- Security scanning
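
The stage directories referenced across these stories (`1_upcoming`, `2_current`, `3_qa`, `4_merge`, `5_archived`) suggest a simple move-to-next-stage operation. The sketch below assumes the stages are strictly linear in numeric order and that an item lives in exactly one stage directory; `find_stage` and `advance` are hypothetical helpers, and the auto-commit step is left out.

```python
import os
import shutil

# Stage directory names as they appear in the stories; linear order is assumed.
STAGES = ["1_upcoming", "2_current", "3_qa", "4_merge", "5_archived"]


def find_stage(work_root, filename):
    """Return the stage directory that currently holds `filename`, or None."""
    for stage in STAGES:
        if os.path.isfile(os.path.join(work_root, stage, filename)):
            return stage
    return None


def advance(work_root, filename):
    """Move a work item into the next stage directory and return that stage."""
    stage = find_stage(work_root, filename)
    if stage is None:
        raise FileNotFoundError(filename)
    index = STAGES.index(stage)
    if index == len(STAGES) - 1:
        raise ValueError(f"{filename} is already archived")
    target = STAGES[index + 1]
    os.makedirs(os.path.join(work_root, target), exist_ok=True)
    shutil.move(
        os.path.join(work_root, stage, filename),
        os.path.join(work_root, target, filename),
    )
    return target
```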
--- a/.story_kit/work/1_upcoming/55_story_live_story_panel_updates.md
+++ b/.story_kit/work/1_upcoming/55_story_live_story_panel_updates.md
@@ -0,0 +1,19 @@
+---
+name: Live Story Panel Updates
+test_plan: pending
+---
+
+# Story 55: Live Story Panel Updates
+
+## User Story
+
+As a user, I want the Upcoming and Review panels to update automatically when stories are created, moved, or archived, so I don't have to manually refresh.
+
+## Acceptance Criteria
+
+- [ ] Server broadcasts a `{"type": "notification", "topic": "stories"}` event over the existing `/ws` WebSocket when a story mutation occurs (create, move to current, archive)
+- [ ] UpcomingPanel auto-refreshes its data when it receives a `stories` notification
+- [ ] ReviewPanel auto-refreshes its data when it receives a `stories` notification
+- [ ] Manual refresh buttons continue to work
+- [ ] Panels do not flicker or lose scroll position on auto-refresh
+- [ ] End-to-end test: create a story via MCP, verify it appears in the Upcoming panel without manual refresh
--- a/.story_kit/work/1_upcoming/56_story_auto_increment_work_item_ids.md
+++ b/.story_kit/work/1_upcoming/56_story_auto_increment_work_item_ids.md
@@ -0,0 +1,24 @@
+---
+name: Auto-Increment Work Item IDs
+test_plan: pending
+---
+
+# Story 56: Auto-Increment Work Item IDs
+
+## User Story
+
+As a developer, I want the server to automatically assign the next sequential ID when creating stories, bugs, or spikes, so that agents don't pick conflicting numbers and I don't have to deduplicate manually.
+
+## Acceptance Criteria
+
+- [ ] `create_story` scans all stories (upcoming, current, archived) to find the highest existing number and assigns N+1
+- [ ] `create_bug` scans all bugs (open and archived) to find the highest existing bug number and assigns N+1
+- [ ] `create_spike` scans all spikes (open and archived) to find the highest existing spike number and assigns N+1
+- [ ] The `name` parameter no longer needs a number prefix — the server prepends it (e.g. `create_story(name="Foo")` → `56_foo.md`)
+- [ ] Race condition: if two agents create stories simultaneously, they get distinct IDs (simple file-system lock or retry)
+- [ ] Existing `create_story` callers (MCP tool, REST API) continue to work with the new behavior
+
+## Out of Scope
+
+- Reserving ID ranges for parallel agents
+- Non-numeric IDs
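
The "highest existing number, then N+1" rule can be sketched as below. This is an assumption-laden illustration: `next_item_id` and `story_filename` are hypothetical names, the caller decides which directories to scan, and the sketch does not address the race-condition criterion (that would still need a lock or a retry around file creation).

```python
import os
import re

# Work item files are assumed to start with a numeric prefix, e.g. "56_foo.md".
_ID_PREFIX = re.compile(r"^(\d+)_")


def next_item_id(directories):
    """Scan the given directories and return the highest numeric prefix plus one."""
    highest = 0
    for directory in directories:
        if not os.path.isdir(directory):
            continue  # a missing stage directory simply contributes nothing
        for entry in os.listdir(directory):
            match = _ID_PREFIX.match(entry)
            if match:
                highest = max(highest, int(match.group(1)))
    return highest + 1


def story_filename(number, name):
    """Prepend the assigned number to a slug of the name, e.g. (56, 'Foo') gives '56_foo.md'."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return f"{number}_{slug}.md"
```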
--- a/.story_kit/work/1_upcoming/57_story_live_test_gate_updates.md
+++ b/.story_kit/work/1_upcoming/57_story_live_test_gate_updates.md
@@ -0,0 +1,19 @@
+---
+name: Live Test Gate Updates
+test_plan: pending
+---
+
+# Story 57: Live Test Gate Updates
+
+## User Story
+
+As a user, I want the Gate and Todo panels to update automatically when tests are recorded or acceptance is checked, so I can see progress without manually refreshing.
+
+## Acceptance Criteria
+
+- [ ] Server broadcasts a `{"type": "notification", "topic": "tests"}` event over `/ws` when tests are recorded, acceptance is checked, or coverage is collected
+- [ ] GatePanel auto-refreshes its data when it receives a `tests` notification
+- [ ] TodoPanel auto-refreshes its data when it receives a `tests` notification
+- [ ] Manual refresh buttons continue to work
+- [ ] Panels do not flicker or lose scroll position on auto-refresh
+- [ ] End-to-end test: record test results via MCP, verify Gate panel updates without manual refresh
--- a/.story_kit/work/1_upcoming/58_story_live_agent_panel_updates.md
+++ b/.story_kit/work/1_upcoming/58_story_live_agent_panel_updates.md
@@ -0,0 +1,18 @@
+---
+name: Live Agent Panel Updates
+test_plan: pending
+---
+
+# Story 58: Live Agent Panel Updates
+
+## User Story
+
+As a user, I want the Agent panel to update automatically when agents start, complete, or fail, so I can monitor progress without manually refreshing.
+
+## Acceptance Criteria
+
+- [ ] Server broadcasts a `{"type": "notification", "topic": "agents"}` event over `/ws` when an agent is started, completes, or fails
+- [ ] AgentPanel auto-refreshes its data when it receives an `agents` notification
+- [ ] Manual refresh button continues to work
+- [ ] Panel does not flicker or lose scroll position on auto-refresh
+- [ ] End-to-end test: start an agent via MCP, verify Agent panel updates without manual refresh
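
Stories 55, 57, and 58 share one notification shape and differ only in topic and in which panels react. The topic-to-panel mapping below is taken from those acceptance criteria; the function names and the choice to ignore malformed or unknown messages are assumptions. The real frontend would do this in its WebSocket handler rather than in Python.

```python
import json

# Which panels refresh for each topic, per the acceptance criteria.
PANELS_BY_TOPIC = {
    "stories": ["UpcomingPanel", "ReviewPanel"],
    "tests": ["GatePanel", "TodoPanel"],
    "agents": ["AgentPanel"],
}


def notification(topic):
    """Build the payload the server would broadcast over /ws."""
    return json.dumps({"type": "notification", "topic": topic})


def panels_to_refresh(raw_message):
    """Return the panels that should reload for a raw WebSocket message."""
    try:
        message = json.loads(raw_message)
    except json.JSONDecodeError:
        return []  # ignore anything that is not valid JSON
    if not isinstance(message, dict) or message.get("type") != "notification":
        return []
    topic = message.get("topic")
    if not isinstance(topic, str):
        return []
    return PANELS_BY_TOPIC.get(topic, [])
```

Keeping the payload free of item data means panels re-fetch through their existing REST calls, so the manual refresh path and the live path share the same loading code.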
--- a/.story_kit/work/1_upcoming/59_story_current_work_panel.md
+++ b/.story_kit/work/1_upcoming/59_story_current_work_panel.md
@@ -0,0 +1,25 @@
+---
+name: Current Work Panel
+test_plan: pending
+---
+
+# Story 59: Current Work Panel
+
+## User Story
+
+As a user, I want a "Current" panel in the frontend that shows all work items (stories, bugs, spikes) currently being worked on and which coder is assigned to each, so I can see at a glance what's in progress.
+
+## Acceptance Criteria
+
+- [ ] New "Current" panel in the right-side panel area
+- [ ] Panel lists all files in `.story_kit/work/2_current/` with their type (story/bug/spike) and name
+- [ ] Each item shows which agent/coder is working on it (from agent pool state)
+- [ ] Items without an assigned agent show as "unassigned"
+- [ ] Panel auto-refreshes when an `agents` or `stories` notification is received (if live notifications exist)
+- [ ] REST endpoint `GET /api/workflow/current` returns current work items with agent assignments
+- [ ] Panel has a manual refresh button
+
+## Out of Scope
+
+- QA and Merge pipeline panels (follow-up stories)
+- Actions from the panel (stop agent, reassign, etc.)
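
One possible shape for the data behind `GET /api/workflow/current` is sketched below. It assumes filenames follow the `<number>_<type>_<slug>.md` pattern seen in this commit's paths, and that agent assignments are available as a mapping keyed by filename; the field names and both functions are hypothetical.

```python
import re

# Assumed filename pattern, e.g. "59_story_current_work_panel.md".
_ITEM_NAME = re.compile(r"^(\d+)_(story|bug|spike)_(.+)\.md$")


def parse_work_item(filename):
    """Split a work item filename into id, type, and a readable name."""
    match = _ITEM_NAME.match(filename)
    if match is None:
        return None
    number, kind, slug = match.groups()
    return {"id": int(number), "type": kind, "name": slug.replace("_", " ")}


def current_work(filenames, assignments):
    """List current work items, marking those with no agent as 'unassigned'."""
    items = []
    for filename in sorted(filenames):
        item = parse_work_item(filename)
        if item is None:
            continue  # skip files that do not look like work items
        item["agent"] = assignments.get(filename, "unassigned")
        items.append(item)
    return items
```

Assignments are keyed by filename rather than by number because a story and a bug may carry the same number.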