Story 60: Status-Based Directory Layout with work/ pipeline

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dave
2026-02-20 17:16:48 +00:00
parent 5fc085fd9e
commit e1e0d49759
74 changed files with 102 additions and 418 deletions

@@ -0,0 +1,19 @@
---
name: Directory-Based Workflow Coordination and Locks
test_plan: pending
---
# Story 29: Directory-Based Workflow Coordination and Locks
## User Story
As a user, I want directory-based story workflow coordination with lock tracking, so multiple agents can pick up work with minimal context while keeping coordination in `master`.
## Acceptance Criteria
- Add a `stories/check/` directory for review/verification handoff.
- Define a lock file format in `master` (e.g., `.story_kit/locks.json`) that tracks story assignment, agent identity, worktree path, and last update time.
- Document the story lifecycle across `upcoming/`, `current/`, `check/`, and `archived/` directories.
- Document that code changes happen in worktrees, while coordination files and story movement live in `master`.
## Out of Scope
- Implementing the lock mechanism or agents in code.
- Enforcing locks at runtime.
- Multi-agent orchestration beyond documenting the workflow.

@@ -0,0 +1,32 @@
---
name: Agent Security and Sandboxing
test_plan: pending
---
# Story 34: Agent Security and Sandboxing
## User Story
**As a** supervisor orchestrating multiple autonomous agents,
**I want to** constrain what each agent can access and do,
**So that** agents can't escape their worktree, damage shared state, or perform unintended actions.
## Acceptance Criteria
- [ ] Agent creation accepts an `allowed_tools` list to restrict Claude Code tool access per agent.
- [ ] Agent creation accepts a `disallowed_tools` list as an alternative to allowlisting.
- [ ] Agents without Bash access can still perform useful coding work (Read, Edit, Write, Glob, Grep).
- [ ] Investigate replacing direct Bash/shell access with Rust-implemented tool proxies that enforce boundaries:
  - Scoped `exec_shell` that only runs allowlisted commands (e.g., `cargo test`, `npm test`) within the agent's worktree.
  - Scoped `read_file` / `write_file` that reject paths outside the agent's worktree root.
  - Scoped `git` operations that only work within the agent's worktree.
- [ ] Evaluate `--max-turns` and `--max-budget-usd` as safety limits for runaway agents.
- [ ] Document the trust model: what the supervisor controls vs what agents can do autonomously.
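The scoped file and shell proxies described above could start from two small checks. The following is a minimal sketch, not the project's implementation: the function names `scope_to_worktree` and `is_allowed_command` are hypothetical, the path check is purely lexical (it does not resolve symlinks and assumes the worktree root is already an absolute, normalized path), and the allowlist matches at command level by comparing argument prefixes.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolve `requested` against `root` without touching the filesystem and
/// return the resulting path only if it stays inside `root`.
/// Hypothetical helper: symlinks are NOT resolved here.
pub fn scope_to_worktree(root: &Path, requested: &Path) -> Option<PathBuf> {
    // `join` replaces `root` entirely when `requested` is absolute.
    let joined = root.join(requested);
    let mut resolved = PathBuf::new();
    for part in joined.components() {
        match part {
            Component::CurDir => {}
            Component::ParentDir => {
                // Refuse to climb above the filesystem root.
                if !resolved.pop() {
                    return None;
                }
            }
            other => resolved.push(other.as_os_str()),
        }
    }
    // `starts_with` compares whole components, so `/a/bc` is not inside `/a/b`.
    if resolved.starts_with(root) {
        Some(resolved)
    } else {
        None
    }
}

/// Command-level allowlist: `argv` is accepted when it begins with one of the
/// allowlisted argument prefixes (e.g. `["cargo", "test"]`).
pub fn is_allowed_command(allowlist: &[&[&str]], argv: &[&str]) -> bool {
    allowlist
        .iter()
        .any(|allowed| !allowed.is_empty() && argv.starts_with(*allowed))
}
```

With a root of `/repo/worktrees/coder-1` (an illustrative path), a request for `src/../Cargo.toml` resolves inside the worktree, while `../coder-2/notes.md` and `/etc/passwd` are rejected. A real proxy would likely also canonicalize paths to defend against symlink escapes.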
## Questions to Explore
- Can we use MCP (Model Context Protocol) to expose our Rust-implemented tools to Claude Code, replacing its built-in Bash/filesystem tools with scoped versions?
- What's the right granularity for shell allowlists — command-level (`cargo test`) or pattern-level (`cargo *`)?
- Should agents have read access outside their worktree (e.g., to reference shared specs) but write access only within it?
- Is OS-level sandboxing (Docker, macOS sandbox profiles) worth the complexity for a personal tool?
## Out of Scope
- Multi-user authentication or authorization (single-user personal tool).
- Network-level isolation between agents.
- Encrypting agent communication channels (all local).

@@ -0,0 +1,24 @@
---
name: Run button does not start agent
---
# Bug 4: Run Button Does Not Start Agent
## Symptom
Clicking the "Run" button in the AgentPanel does not visibly start an agent. No feedback is shown to the user.
## Root Cause
When multiple agents are configured in `project.toml` (e.g. supervisor, coder-1, coder-2), `handleRunClick` shows a role-selector dropdown instead of starting an agent directly. The dropdown may not be visible due to layout/positioning issues, or the click handler may be swallowed.
## Reproduction Steps
1. Start the server and open the web UI
2. Expand a story in the Agent panel
3. Click the "Run" button
4. Observe: nothing visible happens (no agent starts, no dropdown appears)
## Proposed Fix
Investigate whether the role-selector dropdown is rendering but hidden (z-index, overflow, positioning), or whether the click event is not reaching `handleRunClick`. If the dropdown is the issue, consider starting the default agent directly and offering role selection separately.

@@ -0,0 +1,29 @@
---
name: Deterministic Spike Lifecycle Management
test_plan: pending
---
# Story 51: Deterministic Spike Lifecycle Management
## User Story
As a developer running autonomous agents, I want all spike file mutations to happen through server MCP/REST tools that auto-commit to master, so that spikes are tracked consistently alongside stories and bugs.
## Prerequisites
- Story 49 (Deterministic Bug Lifecycle Management)
- Story 50 (Unified Current Work Directory)
## Acceptance Criteria
- [ ] New MCP tool `create_spike(name, description, goals)` creates a spike file in `.story_kit/spikes/` with a deterministic filename and auto-commits to master
- [ ] New MCP tool `list_spikes()` returns all open spikes (files in `.story_kit/spikes/` excluding `archive/`)
- [ ] New MCP tool `archive_spike(spike_id)` moves a spike from `.story_kit/spikes/` to `.story_kit/spikes/archive/` and auto-commits to master
- [ ] `start_agent` moves spike files into `.story_kit/current/` and auto-commits
- [ ] All auto-commits use deterministic commit messages (e.g. "story-kit: create spike spike-3-explore-foo", "story-kit: archive spike spike-3")
- [ ] Agents never need to edit spike markdown files directly — all mutations go through server tools
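One way to get deterministic spike identifiers and commit messages is to derive both from the spike number and a slug of its name. The sketch below is an assumption about how that might look: `slugify`, `spike_id`, and `create_message` are hypothetical names, and only the `story-kit: create spike ...` message format comes from the criteria above.

```rust
/// Lowercase `name`, keep ASCII letters and digits, and collapse every other
/// run of characters into a single hyphen. Hypothetical helper.
pub fn slugify(name: &str) -> String {
    let mut slug = String::new();
    for ch in name.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    // Drop a trailing hyphen left by punctuation at the end of the name.
    slug.trim_end_matches('-').to_string()
}

/// Build an identifier such as `spike-3-explore-foo`.
pub fn spike_id(number: u32, name: &str) -> String {
    format!("spike-{}-{}", number, slugify(name))
}

/// Build the commit message used when a spike is created.
pub fn create_message(spike_id: &str) -> String {
    format!("story-kit: create spike {}", spike_id)
}
```

Because the same inputs always produce the same strings, two runs that create spike 3 named "Explore Foo" yield an identical identifier and commit message.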
## Out of Scope
- Spike-to-story conversion tooling
- Time-boxing or expiry for spikes

@@ -0,0 +1,26 @@
---
name: Mergemaster Agent Role
test_plan: pending
---
# Story 52: Mergemaster Agent Role
## User Story
As a developer, I want a dedicated mergemaster agent that handles the full accept→merge→archive→cleanup pipeline, so that merging coder work to master is deterministic and doesn't require manual conflict resolution.
## Acceptance Criteria
- [ ] New `mergemaster` agent role in `.story_kit/project.toml`
- [ ] Mergemaster can cherry-pick or rebase a worktree branch onto master
- [ ] Mergemaster resolves merge conflicts (or reports them clearly if it can't)
- [ ] Mergemaster runs all quality gates after merge (cargo test, cargo clippy, pnpm test, pnpm build)
- [ ] Mergemaster moves the story/bug from `work/4_merge/` to `work/5_archived/` and auto-commits
- [ ] Mergemaster cleans up the worktree and branch after successful merge
- [ ] MCP tool `merge_agent_work(agent_name, story_id)` triggers the mergemaster pipeline
- [ ] Mergemaster reports success/failure with details (conflicts found, tests passed/failed)
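The move from `work/4_merge/` to `work/5_archived/` can be modeled as advancing a work item one stage along the pipeline. This is a minimal sketch under assumptions: the `Stage` enum and `advance_paths` function are hypothetical, and only the four stage directories named in these stories (`2_current`, `3_qa`, `4_merge`, `5_archived`) are included.

```rust
use std::path::{Path, PathBuf};

/// Pipeline stages that appear in the stories; earlier stages are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Current,
    Qa,
    Merge,
    Archived,
}

impl Stage {
    /// Directory name of the stage under the `work/` root.
    pub fn dir_name(self) -> &'static str {
        match self {
            Stage::Current => "2_current",
            Stage::Qa => "3_qa",
            Stage::Merge => "4_merge",
            Stage::Archived => "5_archived",
        }
    }

    /// The stage that follows this one, or `None` once archived.
    pub fn next(self) -> Option<Stage> {
        match self {
            Stage::Current => Some(Stage::Qa),
            Stage::Qa => Some(Stage::Merge),
            Stage::Merge => Some(Stage::Archived),
            Stage::Archived => None,
        }
    }
}

/// Compute the (source, destination) paths for moving `file_name` from
/// `stage` to the next stage. Returns `None` for archived items.
pub fn advance_paths(work_root: &Path, stage: Stage, file_name: &str) -> Option<(PathBuf, PathBuf)> {
    let next = stage.next()?;
    Some((
        work_root.join(stage.dir_name()).join(file_name),
        work_root.join(next.dir_name()).join(file_name),
    ))
}
```

A caller would pass the two paths to a rename followed by the auto-commit. Keeping the path computation separate from the filesystem call makes the stage ordering easy to test.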
## Out of Scope
- Automated conflict resolution using AI (can follow later — start with simple cherry-pick/rebase)
- Running mergemaster as a persistent daemon

@@ -0,0 +1,42 @@
---
name: QA Agent Role
test_plan: pending
---
# Story 53: QA Agent Role
## User Story
As a developer, I want a dedicated QA agent that reviews coder work in worktrees before merge, so that obvious bugs, quality issues, and missing test coverage are caught before code reaches master.
## Acceptance Criteria
### Code Quality Scan
- [ ] QA agent scans the worktree diff for obvious coding mistakes (unused imports, dead code, unhandled errors, hardcoded values)
- [ ] QA agent runs `cargo clippy --all-targets --all-features` and reports any warnings
- [ ] QA agent runs `pnpm run build` (tsc + vite) and reports any TypeScript errors
- [ ] QA agent runs `biome check` and reports any linting issues
### Test Verification
- [ ] QA agent runs `cargo test` and verifies all tests pass
- [ ] QA agent runs `pnpm run test` and verifies all frontend tests pass
- [ ] QA agent runs coverage collection and reports coverage percentage
- [ ] QA agent reviews test quality — flags tests that are trivial or don't assert meaningful behavior
### Manual Testing Support
- [ ] QA agent builds the server and frontend in the worktree
- [ ] QA agent starts a test server on a free port
- [ ] QA agent generates a testing plan: URL to visit, things to check in the UI, curl commands to exercise endpoints
- [ ] QA agent presents the testing plan to the human via `report_completion` or a new MCP tool
- [ ] Human can approve or reject with feedback
### Agent Configuration
- [ ] New `qa` agent role in `.story_kit/project.toml`
- [ ] MCP tool `request_qa(agent_name, story_id)` triggers QA review of a worktree and moves the item from `work/2_current/` to `work/3_qa/`
- [ ] QA agent produces a structured report (pass/fail per category, details, testing plan)
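The structured report in the last criterion could be represented roughly as follows. This is only an illustrative shape: the type and field names (`QaReport`, `CategoryResult`, `testing_plan`) are assumptions, not a format the story defines.

```rust
/// Outcome of one QA category, e.g. "clippy" or "cargo test".
#[derive(Debug, Clone, PartialEq)]
pub struct CategoryResult {
    pub category: String,
    pub passed: bool,
    pub details: String,
}

/// Hypothetical report: per-category results plus manual testing steps.
#[derive(Debug, Clone, PartialEq)]
pub struct QaReport {
    pub results: Vec<CategoryResult>,
    pub testing_plan: Vec<String>,
}

impl QaReport {
    /// The report passes only if at least one category ran and none failed.
    pub fn passed(&self) -> bool {
        !self.results.is_empty() && self.results.iter().all(|r| r.passed)
    }

    /// Names of the categories that failed, in report order.
    pub fn failed_categories(&self) -> Vec<&str> {
        self.results
            .iter()
            .filter(|r| !r.passed)
            .map(|r| r.category.as_str())
            .collect()
    }
}
```

Treating an empty report as a failure is a deliberate choice in this sketch, so that a QA run that executed nothing cannot be mistaken for a clean pass.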
## Out of Scope
- Automated UI testing (Playwright, Cypress)
- Performance/load testing
- Security scanning

@@ -0,0 +1,19 @@
---
name: Live Story Panel Updates
test_plan: pending
---
# Story 55: Live Story Panel Updates
## User Story
As a user, I want the Upcoming and Review panels to update automatically when stories are created, moved, or archived, so I don't have to manually refresh.
## Acceptance Criteria
- [ ] Server broadcasts a `{"type": "notification", "topic": "stories"}` event over the existing `/ws` WebSocket when a story mutation occurs (create, move to current, archive)
- [ ] UpcomingPanel auto-refreshes its data when it receives a `stories` notification
- [ ] ReviewPanel auto-refreshes its data when it receives a `stories` notification
- [ ] Manual refresh buttons continue to work
- [ ] Panels do not flicker or lose scroll position on auto-refresh
- [ ] End-to-end test: create a story via MCP, verify it appears in the Upcoming panel without manual refresh
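The notification payload in the first criterion is small enough to build without a JSON library. The sketch below assumes a hypothetical `Topic` enum and `notification_json` function; the `tests` and `agents` variants are included only because the related live-update stories use the same message shape with those topic names.

```rust
/// Topics a panel can subscribe to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Topic {
    Stories,
    Tests,
    Agents,
}

impl Topic {
    /// Wire name of the topic as it appears in the JSON message.
    pub fn as_str(self) -> &'static str {
        match self {
            Topic::Stories => "stories",
            Topic::Tests => "tests",
            Topic::Agents => "agents",
        }
    }
}

/// Serialize the notification sent over the `/ws` WebSocket.
/// Topic names are fixed ASCII strings, so no escaping is needed.
pub fn notification_json(topic: Topic) -> String {
    format!(r#"{{"type": "notification", "topic": "{}"}}"#, topic.as_str())
}
```

Since the message carries no story data, a panel that receives it simply refetches its list, which keeps the broadcast path independent of the panel's data format.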

@@ -0,0 +1,24 @@
---
name: Auto-Increment Work Item IDs
test_plan: pending
---
# Story 56: Auto-Increment Work Item IDs
## User Story
As a developer, I want the server to automatically assign the next sequential ID when creating stories, bugs, or spikes, so that agents don't pick conflicting numbers and I don't have to deduplicate manually.
## Acceptance Criteria
- [ ] `create_story` scans all stories (upcoming, current, archived) to find the highest existing number and assigns N+1
- [ ] `create_bug` scans all bugs (open and archived) to find the highest existing bug number and assigns N+1
- [ ] `create_spike` scans all spikes (open and archived) to find the highest existing spike number and assigns N+1
- [ ] The `name` parameter no longer needs a number prefix; the server prepends it (e.g. `create_story(name="Foo")` → `56_foo.md`)
- [ ] Race condition: if two agents create stories simultaneously, they get distinct IDs (simple file-system lock or retry)
- [ ] Existing `create_story` callers (MCP tool, REST API) continue to work with the new behavior
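The "highest existing number plus one" rule can be sketched as a scan over file names. This is a minimal illustration, assuming work item files begin with their numeric ID (as in `56_foo.md`); `leading_number` and `next_id` are hypothetical names, and the sketch does not address the race condition, which would still need a lock or retry around file creation.

```rust
/// Parse the run of ASCII digits at the start of a file name, if any.
pub fn leading_number(file_name: &str) -> Option<u32> {
    let digits: String = file_name
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    // An empty string (no leading digits) fails to parse and yields `None`.
    digits.parse().ok()
}

/// Return the next free ID: one more than the highest numbered file,
/// or 1 when no numbered files exist.
pub fn next_id(file_names: &[&str]) -> u32 {
    file_names
        .iter()
        .filter_map(|name| leading_number(name))
        .max()
        .map_or(1, |highest| highest + 1)
}
```

The caller would collect names from every directory that can hold the item type (upcoming, current, archived) before calling `next_id`, so that archived items still reserve their numbers.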
## Out of Scope
- Reserving ID ranges for parallel agents
- Non-numeric IDs

@@ -0,0 +1,19 @@
---
name: Live Test Gate Updates
test_plan: pending
---
# Story 57: Live Test Gate Updates
## User Story
As a user, I want the Gate and Todo panels to update automatically when tests are recorded or acceptance is checked, so I can see progress without manually refreshing.
## Acceptance Criteria
- [ ] Server broadcasts a `{"type": "notification", "topic": "tests"}` event over `/ws` when tests are recorded, acceptance is checked, or coverage is collected
- [ ] GatePanel auto-refreshes its data when it receives a `tests` notification
- [ ] TodoPanel auto-refreshes its data when it receives a `tests` notification
- [ ] Manual refresh buttons continue to work
- [ ] Panels do not flicker or lose scroll position on auto-refresh
- [ ] End-to-end test: record test results via MCP, verify Gate panel updates without manual refresh

@@ -0,0 +1,18 @@
---
name: Live Agent Panel Updates
test_plan: pending
---
# Story 58: Live Agent Panel Updates
## User Story
As a user, I want the Agent panel to update automatically when agents start, complete, or fail, so I can monitor progress without manually refreshing.
## Acceptance Criteria
- [ ] Server broadcasts a `{"type": "notification", "topic": "agents"}` event over `/ws` when an agent is started, completes, or fails
- [ ] AgentPanel auto-refreshes its data when it receives an `agents` notification
- [ ] Manual refresh button continues to work
- [ ] Panel does not flicker or lose scroll position on auto-refresh
- [ ] End-to-end test: start an agent via MCP, verify Agent panel updates without manual refresh

@@ -0,0 +1,25 @@
---
name: Current Work Panel
test_plan: pending
---
# Story 59: Current Work Panel
## User Story
As a user, I want a "Current" panel in the frontend that shows all work items (stories, bugs, spikes) currently being worked on and which coder is assigned to each, so I can see at a glance what's in progress.
## Acceptance Criteria
- [ ] New "Current" panel in the right-side panel area
- [ ] Panel lists all files in `.story_kit/work/2_current/` with their type (story/bug/spike) and name
- [ ] Each item shows which agent/coder is working on it (from agent pool state)
- [ ] Items without an assigned agent show as "unassigned"
- [ ] Panel auto-refreshes when an `agents` or `stories` notification is received (if live notifications exist)
- [ ] REST endpoint `GET /api/workflow/current` returns current work items with agent assignments
- [ ] Panel has a manual refresh button
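The data behind `GET /api/workflow/current` amounts to joining the files in `work/2_current/` with the agent pool state. The following sketch assumes the pool can be viewed as a map from file name to agent name; that representation, along with `CurrentItem` and `current_items`, is hypothetical.

```rust
use std::collections::HashMap;

/// One row of the "Current" panel.
#[derive(Debug, Clone, PartialEq)]
pub struct CurrentItem {
    pub file_name: String,
    pub agent: String,
}

/// Pair each current work item with its agent, falling back to
/// "unassigned" when the pool has no entry for the file.
pub fn current_items(
    file_names: &[&str],
    assignments: &HashMap<String, String>,
) -> Vec<CurrentItem> {
    file_names
        .iter()
        .map(|name| CurrentItem {
            file_name: (*name).to_string(),
            agent: assignments
                .get(*name)
                .cloned()
                .unwrap_or_else(|| "unassigned".to_string()),
        })
        .collect()
}
```

Items are returned in the order the file names are given, so the caller decides the panel's sort order (for example, by numeric ID).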
## Out of Scope
- QA and Merge pipeline panels (follow-up stories)
- Actions from the panel (stop agent, reassign, etc.)