Story 60: Status-Based Directory Layout with work/ pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -27,13 +27,13 @@ You have these tools via the story-kit MCP server:
 - get_agent_output(story_id, agent_name, timeout_ms) - Poll agent output (returns recent events, call repeatedly)
 - list_agents() - See all running agents and their status
 - stop_agent(story_id, agent_name) - Stop a running agent
-- get_story_todos(story_id) - Get unchecked acceptance criteria for a story in current/
+- get_story_todos(story_id) - Get unchecked acceptance criteria for a story in work/2_current/
 - ensure_acceptance(story_id) - Check if a story passes acceptance gates

 ## Your Workflow
 1. Read CLAUDE.md and .story_kit/README.md to understand the project and dev process
-2. Read the story file from .story_kit/stories/ to understand requirements
-3. Move it to current/ if it is in upcoming/
+2. Read the story file from .story_kit/work/ to understand requirements
+3. Move it to work/2_current/ if it is in work/1_upcoming/
 4. Start coder-1 on the story: call start_agent with story_id="{{story_id}}" and agent_name="coder-1"
 5. Wait for completion: call wait_for_agent with story_id="{{story_id}}" and agent_name="coder-1". The coder will call report_completion when done, which runs acceptance gates automatically. wait_for_agent returns when the coder reports completion.
 6. Check the result: inspect the "completion" field in the wait_for_agent response — if gates_passed is true, the work is done; if false, review the gate_output and decide whether to start a fresh coder.
@@ -54,7 +54,7 @@ role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
-prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. Pick up the story from .story_kit/stories/ - move it to current/ if needed. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: When all your work is committed, call report_completion as your FINAL action: report_completion(story_id='{{story_id}}', agent_name='{{agent_name}}', summary='<brief summary of what you implemented>'). The server will run cargo clippy and tests automatically to verify your work."
+prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. Pick up the story from .story_kit/work/ - move it to work/2_current/ if needed. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: When all your work is committed, call report_completion as your FINAL action: report_completion(story_id='{{story_id}}', agent_name='{{agent_name}}', summary='<brief summary of what you implemented>'). The server will run cargo clippy and tests automatically to verify your work."
 system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. ALWAYS call report_completion as your absolute final action after committing."

 [[agent]]
@@ -63,7 +63,7 @@ role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
-prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. Pick up the story from .story_kit/stories/ - move it to current/ if needed. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: When all your work is committed, call report_completion as your FINAL action: report_completion(story_id='{{story_id}}', agent_name='{{agent_name}}', summary='<brief summary of what you implemented>'). The server will run cargo clippy and tests automatically to verify your work."
+prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. Pick up the story from .story_kit/work/ - move it to work/2_current/ if needed. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: When all your work is committed, call report_completion as your FINAL action: report_completion(story_id='{{story_id}}', agent_name='{{agent_name}}', summary='<brief summary of what you implemented>'). The server will run cargo clippy and tests automatically to verify your work."
 system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. ALWAYS call report_completion as your absolute final action after committing."

 [[agent]]
@@ -72,5 +72,5 @@ role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
-prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. Pick up the story from .story_kit/stories/ - move it to current/ if needed. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: When all your work is committed, call report_completion as your FINAL action: report_completion(story_id='{{story_id}}', agent_name='{{agent_name}}', summary='<brief summary of what you implemented>'). The server will run cargo clippy and tests automatically to verify your work."
+prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. Pick up the story from .story_kit/work/ - move it to work/2_current/ if needed. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: When all your work is committed, call report_completion as your FINAL action: report_completion(story_id='{{story_id}}', agent_name='{{agent_name}}', summary='<brief summary of what you implemented>'). The server will run cargo clippy and tests automatically to verify your work."
 system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. ALWAYS call report_completion as your absolute final action after committing."
@@ -1,115 +0,0 @@
---
name: MCP Server for Workflow API
---

# Spike 1: MCP Server for Workflow API

## Question

Can we expose the Story Kit workflow API as MCP tools so that agents call enforced endpoints instead of manipulating files directly?

## Hypothesis

A thin stdio MCP server that proxies to the existing Rust HTTP API will let Claude Code agents use `create_story`, `validate_stories`, `record_tests`, and `ensure_acceptance` as native tools, with zero changes to the existing server.

## Timebox

2 hours

## Investigation Plan

1. Understand the MCP stdio protocol (JSON-RPC over stdin/stdout)
2. Identify which workflow endpoints should become MCP tools
3. Determine the best language/approach for the MCP server (Rust binary vs Node script vs Rust integrated into existing server)
4. Prototype a minimal MCP server with one tool (`create_story`) and test it with `claude mcp add`
5. Verify spawned agents (via `claude -p`) inherit MCP tools
6. Evaluate whether we can restrict agents from writing to `.story_kit/stories/` directly

## Findings

### 1. MCP stdio protocol is simple
JSON-RPC 2.0 over stdin/stdout. Three-phase: initialize handshake → tools/list → tools/call. A minimal server needs to handle ~3 message types. No HTTP, no sockets.

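To make the three phases concrete, here is a minimal sketch of the request lines a client would write to the server's stdin. The helper names (`jsonrpc_request`, `handshake_sequence`), the `clientInfo` values, and the `protocolVersion` string are illustrative assumptions, not taken from the spike; the method names `initialize`, `tools/list`, and `tools/call` are the ones named above. The MCP specification also has the client send a `notifications/initialized` notification after `initialize`, which this sketch omits.

```rust
/// Build one JSON-RPC 2.0 request as a single line of text.
/// `params` must already be valid JSON; this sketch does no escaping.
fn jsonrpc_request(id: u64, method: &str, params: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params
    )
}

/// The three phases in the order a client would send them.
/// All parameter values here are placeholders chosen for illustration.
fn handshake_sequence() -> Vec<String> {
    vec![
        jsonrpc_request(
            1,
            "initialize",
            r#"{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"example-client","version":"0.0.0"}}"#,
        ),
        jsonrpc_request(2, "tools/list", "{}"),
        jsonrpc_request(
            3,
            "tools/call",
            r#"{"name":"create_story","arguments":{"name":"Example story"}}"#,
        ),
    ]
}
```

Printing each element of `handshake_sequence()` on its own line gives the newline-delimited stream a stdio server would read. A real client would build these with a JSON library rather than string formatting.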
### 2. The `rmcp` Rust crate makes this trivial
The official Rust SDK (`rmcp` 0.3) provides `#[tool]` and `#[tool_router]` macros that eliminate boilerplate. A tool is just an async function with typed parameters:

```rust
use serde::{Deserialize, Serialize};

// `Serialize` is required so the request can be forwarded with `.json(&req)`.
#[derive(Debug, Serialize, Deserialize, schemars::JsonSchema)]
pub struct CreateStoryRequest {
    #[schemars(description = "Human-readable story name")]
    pub name: String,
    #[schemars(description = "User story text")]
    pub user_story: Option<String>,
    #[schemars(description = "List of acceptance criteria")]
    pub acceptance_criteria: Option<Vec<String>>,
}

#[tool(description = "Create a new story with correct front matter in upcoming/")]
async fn create_story(
    &self,
    Parameters(req): Parameters<CreateStoryRequest>,
) -> Result<CallToolResult, McpError> {
    // Forward the request to the existing HTTP API (error handling elided).
    let resp = self.client.post(&format!("{}/workflow/stories/create", self.api_url))
        .json(&req).send().await...;
    Ok(CallToolResult::success(vec![Content::text(resp.story_id)]))
}
```

Dependencies needed: `rmcp` (server, transport-io), `schemars`, `reqwest`, `tokio`, `serde`. We already use most of these in the existing server.

### 3. Architecture: separate binary, same workspace
Best approach is a new binary crate (`story-kit-mcp`) in the workspace that:
- Reads the API URL from env or CLI arg (default `http://localhost:3000/api`)
- Proxies each MCP tool call to the corresponding HTTP endpoint
- Returns the API response as tool output

This keeps the MCP layer thin and the enforcement logic in the existing server. No code duplication: the MCP binary is just a translation layer.

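One way the URL lookup in the first bullet could work is sketched below. The precedence (CLI flag, then environment, then default) is an assumption, since the spike only says "env or CLI arg"; the `--api-url` flag matches the `.mcp.json` example in this spike, while the function name and the environment variable name mentioned in the comment are hypothetical.

```rust
/// Default taken from the finding above.
const DEFAULT_API_URL: &str = "http://localhost:3000/api";

/// Resolve the API base URL.
/// Assumed precedence: `--api-url <value>` first, then the environment
/// value, then the default. The environment value is passed in so the
/// function stays easy to test; a caller might supply
/// `std::env::var("STORY_KIT_API_URL").ok()` (hypothetical variable name).
fn resolve_api_url(args: &[String], env_value: Option<String>) -> String {
    let from_cli = args
        .windows(2)
        .find(|pair| pair[0] == "--api-url")
        .map(|pair| pair[1].clone());
    from_cli
        .or(env_value)
        .unwrap_or_else(|| DEFAULT_API_URL.to_string())
}
```

Keeping the lookup a pure function of its inputs means the binary's `main` only has to collect `std::env::args()` and the environment value and pass them in.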
### 4. Which endpoints become tools

| MCP Tool | HTTP Endpoint | Why |
|---|---|---|
| `create_story` | POST /workflow/stories/create | Enforce front matter |
| `validate_stories` | GET /workflow/stories/validate | Check all stories |
| `record_tests` | POST /workflow/tests/record | Record test results |
| `ensure_acceptance` | POST /workflow/acceptance/ensure | Gate story acceptance |
| `collect_coverage` | POST /workflow/coverage/collect | Run + record coverage |
| `get_story_todos` | GET /workflow/todos | See remaining work |
| `list_upcoming` | GET /workflow/upcoming | See backlog |

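The table maps directly onto a lookup from tool name to HTTP method and path. The sketch below shows one possible shape for that lookup; the `HttpMethod` enum and both function names are hypothetical, and only the tool names and paths come from the table.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HttpMethod {
    Get,
    Post,
}

/// Map an MCP tool name to the HTTP method and path listed in the table.
/// Unknown tool names return `None` so the caller can report an error.
fn route_for_tool(tool: &str) -> Option<(HttpMethod, &'static str)> {
    use HttpMethod::{Get, Post};
    match tool {
        "create_story" => Some((Post, "/workflow/stories/create")),
        "validate_stories" => Some((Get, "/workflow/stories/validate")),
        "record_tests" => Some((Post, "/workflow/tests/record")),
        "ensure_acceptance" => Some((Post, "/workflow/acceptance/ensure")),
        "collect_coverage" => Some((Post, "/workflow/coverage/collect")),
        "get_story_todos" => Some((Get, "/workflow/todos")),
        "list_upcoming" => Some((Get, "/workflow/upcoming")),
        _ => None,
    }
}

/// Join the API base URL and the tool's path, tolerating a trailing slash.
fn endpoint_url(api_url: &str, tool: &str) -> Option<String> {
    route_for_tool(tool)
        .map(|(_, path)| format!("{}{}", api_url.trim_end_matches('/'), path))
}
```

With the `rmcp` macros each tool would still be its own function, so a table like this is mainly useful for tests or for a hand-rolled dispatcher.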
### 5. Configuration via `.mcp.json` (project-scoped)
```json
{
  "mcpServers": {
    "story-kit": {
      "type": "stdio",
      "command": "./target/release/story-kit-mcp",
      "args": ["--api-url", "http://localhost:${STORYKIT_PORT:-3000}/api"]
    }
  }
}
```

This gets checked into the repo. Every Claude Code session and every spawned agent inherits it automatically.

### 6. Agent restrictions
Claude Code's `.claude/settings.local.json` can restrict which tools agents have access to. We could:
- Give agents the MCP tools (`story-kit:create_story`, etc.)
- Restrict or remove Write access to `.story_kit/stories/` paths
- This forces agents through the API for all workflow actions

Caveat: tool restrictions are advisory in `settings.local.json`; agents with Bash access could still `echo > file`. Full enforcement requires removing Bash or scoping it (which is story 35's problem).

### 7. Effort estimate
The MCP binary itself is ~200-300 lines of Rust. One afternoon of work. Most of the time would be testing the integration with agent spawning and worktrees.

## Recommendation

**Proceed with a story.** The spike confirms this is straightforward and high-value. The `rmcp` crate handles the protocol complexity, and our existing HTTP API already does the enforcement. The MCP server is just plumbing.

Suggested story scope:
1. New `story-kit-mcp` binary crate in the workspace
2. Expose the 7 tools listed above
3. Add `.mcp.json` to the project
4. Update agent spawn to ensure MCP tools are available in worktrees
5. Test: spawn agent, verify it uses MCP tools instead of file writes
@@ -1,129 +0,0 @@
# Spike: Claude Code Integration via PTY + CLI

**Question:** Can we run Claude Code programmatically from our Rust backend while using Max subscription billing instead of per-token API billing?

**Hypothesis:** Spawning `claude -p` inside a pseudo-terminal (PTY) will make `isatty()` return true, causing Claude Code to use Max subscription billing while giving us structured JSON output.

**Timebox:** 2 hours

**Result: HYPOTHESIS CONFIRMED**

---

## Proof

Spawning `claude -p "hi" --output-format stream-json --verbose` inside a PTY from Rust (`portable-pty` crate) produces:

```json
{"type":"system","subtype":"init","apiKeySource":"none","model":"claude-opus-4-6",...}
{"type":"rate_limit_event","rate_limit_info":{"status":"allowed","rateLimitType":"five_hour",...}}
{"type":"assistant","message":{"model":"claude-opus-4-6","content":[{"type":"text","text":"Hi! How can I help you today?"}],...}}
{"type":"result","subtype":"success","total_cost_usd":0.0102,...}
```

Key evidence:
- **`apiKeySource: "none"`**: not using an API key
- **`rateLimitType: "five_hour"`**: Max subscription rate limiting (not per-token)
- **`model: "claude-opus-4-6"`**: Opus on Max plan
- Clean NDJSON output, parseable from Rust
- Response streamed to browser UI via WebSocket

## Architecture (Proven)

```
Browser UI → WebSocket → Rust Backend → PTY → claude -p --output-format stream-json
                                         ↑
                          isatty() = true → Max subscription billing
```

## What Works

1. `portable-pty` crate spawns Claude Code in a PTY from Rust
2. `-p` flag gives single-shot non-interactive mode (no TUI)
3. `--output-format stream-json` gives clean NDJSON (no ANSI escapes)
4. PTY makes `isatty()` return true → Max billing
5. NDJSON events parsed and streamed to frontend via WebSocket
6. Session IDs returned for potential multi-turn via `--resume`

## Event Types from stream-json

| Type | Purpose | Key Fields |
|------|---------|------------|
| `system` | Init event | `session_id`, `model`, `apiKeySource`, `tools`, `agents` |
| `rate_limit_event` | Billing info | `status`, `rateLimitType` |
| `assistant` | Claude's response | `message.content[].text` |
| `result` | Final summary | `total_cost_usd`, `usage`, `duration_ms` |
| `stream_event` | Token deltas (with `--include-partial-messages`) | `event.delta.text` |

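As a rough illustration of dispatching on these event types, the sketch below classifies one NDJSON line by its `type` value using only the standard library. It is deliberately naive: it assumes the top-level `type` key is the first `"type":"` occurrence on the line and that there is no whitespace around the colon, as in the sample output earlier in this spike. The `EventKind` enum and both function names are hypothetical; a real backend would more likely deserialize each line with a JSON library.

```rust
#[derive(Debug, PartialEq, Eq)]
enum EventKind {
    System,
    RateLimit,
    Assistant,
    Result,
    Stream,
    /// Any type not listed in the table above.
    Unknown(String),
}

/// Return the value of the first `"type":"..."` pair on the line, if any.
fn event_type(line: &str) -> Option<&str> {
    const KEY: &str = "\"type\":\"";
    let start = line.find(KEY)? + KEY.len();
    let end = start + line[start..].find('"')?;
    Some(&line[start..end])
}

/// Classify one NDJSON line; lines without a `type` field yield `None`.
fn classify(line: &str) -> Option<EventKind> {
    let kind = match event_type(line)? {
        "system" => EventKind::System,
        "rate_limit_event" => EventKind::RateLimit,
        "assistant" => EventKind::Assistant,
        "result" => EventKind::Result,
        "stream_event" => EventKind::Stream,
        other => EventKind::Unknown(other.to_string()),
    };
    Some(kind)
}
```

Returning `Unknown` rather than failing lets the reader loop keep going if the CLI emits an event type the table does not cover.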
## Multi-Agent Concurrency (Proven)

Created an `AgentPool` with REST API (`POST /api/agents`, `POST /api/agents/:name/message`, `GET /api/agents`) and tested 2 concurrent coding agents:

**Test:** Created `coder-1` (frontend role) and `coder-2` (backend role), sent both messages simultaneously.

```
coder-1: Listed 5 React components in 5s (session: ca3e13fc-...)
coder-2: Listed 30 Rust source files in 8s (session: 8a815cf0-...)
Both: apiKeySource: "none", rateLimitType: "five_hour" (Max billing)
```

**Session resumption confirmed:** Sent coder-1 a follow-up "How many components did you just list?"; it answered "5" using `--resume <session_id>`.

**What this proves:**
- Multiple PTY sessions run concurrently without conflict
- Each gets Max subscription billing independently
- `--resume` gives agents multi-turn conversation memory
- Supervisor pattern works: coordinator reads agent responses, sends coordinated tasks
- Inter-agent communication possible via supervisor relay

**Architecture for multi-agent orchestration:**
- Spawn N PTY sessions, each with `claude -p` pointed at a different worktree
- Rust backend coordinates work between agents
- Different `--model` per agent (Opus for supervisor, Sonnet/Haiku for workers)
- `--allowedTools` to restrict what each agent can do
- `--max-turns` and `--max-budget-usd` for safety limits

## Key Flags for Programmatic Use

```bash
claude -p "prompt"                       # Single-shot mode
  --output-format stream-json            # NDJSON output
  --verbose                              # Include all events
  --include-partial-messages             # Token-by-token streaming
  --model sonnet                         # Model selection
  --allowedTools "Read,Edit,Bash"        # Tool permissions
  --permission-mode bypassPermissions    # No approval prompts
  --resume <session_id>                  # Continue conversation
  --max-turns 10                         # Safety limit
  --max-budget-usd 5.00                  # Cost cap
  --append-system-prompt "..."           # Custom instructions
  --cwd /path/to/worktree                # Working directory
```

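A backend that spawns agents needs to turn per-agent settings into this argument list. The sketch below shows one way to do that using only flags from the listing above; the `AgentConfig` struct, its field names, and `claude_args` are hypothetical and are not the spike's actual code.

```rust
/// Hypothetical per-agent settings; field names are illustrative only.
struct AgentConfig {
    prompt: String,
    model: String,
    max_turns: u32,
    max_budget_usd: f64,
    /// Session to continue, if this is a follow-up message.
    resume_session: Option<String>,
}

/// Build the argument vector to pass to the `claude` binary.
/// Only flags that appear in the listing above are used.
fn claude_args(cfg: &AgentConfig) -> Vec<String> {
    let mut args = vec![
        "-p".to_string(),
        cfg.prompt.clone(),
        "--output-format".to_string(),
        "stream-json".to_string(),
        "--verbose".to_string(),
        "--model".to_string(),
        cfg.model.clone(),
        "--max-turns".to_string(),
        cfg.max_turns.to_string(),
        "--max-budget-usd".to_string(),
        format!("{:.2}", cfg.max_budget_usd),
    ];
    // `--resume` is only added when a previous session id is known.
    if let Some(session_id) = &cfg.resume_session {
        args.push("--resume".to_string());
        args.push(session_id.clone());
    }
    args
}
```

Passing the vector element by element to the process spawner, rather than joining it into one shell string, avoids quoting problems when the prompt contains spaces or quotes.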
## Agent SDK Comparison

The Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) is a richer TypeScript API with hooks, subagents, and MCP integration, but it **requires an API key** (per-token billing). The PTY approach is the only way to get Max subscription billing programmatically.

| Factor | PTY + CLI | Agent SDK |
|--------|-----------|-----------|
| Billing | Max subscription | API key (per-token) |
| Language | Any (subprocess) | TypeScript/Python |
| Streaming | NDJSON parsing | Native async iterators |
| Hooks | Not available | Callback functions |
| Subagents | Multiple processes | In-process `agents` option |
| Sessions | `--resume` flag | In-memory |
| Complexity | Low | Medium (needs Node.js) |

## Caveats

- Cost reported in `total_cost_usd` is informational, not actual billing
- Concurrent PTY sessions may hit Max subscription rate limits
- Each `-p` invocation is a fresh process (startup overhead ~2-3s)
- PTY dependency (`portable-pty`) adds ~15 crates

## Next Steps

1. **Story:** Add `--include-partial-messages` for real-time token streaming to browser
2. **Story:** Production multi-agent orchestration with worktree isolation per agent
3. **Story:** Streaming HTTP responses (SSE) instead of blocking request until agent completes
4. **Consider:** Whether Rust backend should become a thin orchestration layer over Claude Code rather than reimplementing agent capabilities
@@ -1,26 +0,0 @@
---
name: Cross-Platform Binary Distribution
test_plan: approved
---

# Story 54: Cross-Platform Binary Distribution

## User Story

As a developer, I want to build self-contained binaries for macOS and Linux so that I can share Story Kit with others without requiring them to have a Rust toolchain.

## Acceptance Criteria

- [ ] `cargo build --release` produces a binary with no non-system dynamic dependencies on macOS (current state; verify)
- [ ] CI or a documented process can produce a fully static Linux x86_64 binary using the `x86_64-unknown-linux-musl` target (via cross-compilation or Docker build)
- [ ] The Linux binary has zero dynamic library dependencies (`ldd` reports "not a dynamic executable")
- [ ] The frontend is embedded in the binary via `rust-embed` (current state; verify still works in release builds)
- [ ] A Linux user can download and run the single binary without installing Rust, Node, glibc, or any extra libraries
- [ ] Build instructions are documented in the project (e.g. a `Makefile` or `justfile` with `build-linux` / `build-macos` targets)

## Out of Scope

- Homebrew formula or package manager publishing
- Windows support
- Auto-update mechanism
- Code signing or notarization