Bump version to 0.8.8

storkit: done 460_bug_strip_bot_mention_fails_on_element_markdown_mention_pill_format
storkit: merge 460_bug_strip_bot_mention_fails_on_element_markdown_mention_pill_format
2026-04-03 11:07:39 +01:00 · 2026-04-03 10:00:54 +00:00 · 2026-04-03 10:00:50 +00:00 · 2026-04-03 09:53:38 +00:00 · 2026-04-03 09:51:18 +00:00 · 2026-04-02 21:06:38 +00:00
675 changed files with 60241 additions and 26307 deletions
@@ -1,12 +1,12 @@
 {
  "enabledMcpjsonServers": [
-    "story-kit"
+    "storkit"
  ],
  "permissions": {
    "allow": [
-      "Bash(./server/target/debug/story-kit:*)",
+      "Bash(./server/target/debug/storkit:*)",
-      "Bash(./target/debug/story-kit:*)",
+      "Bash(./target/debug/storkit:*)",
-      "Bash(STORYKIT_PORT=*)",
+      "Bash(STORKIT_PORT=*)",
      "Bash(cargo build:*)",
      "Bash(cargo check:*)",
      "Bash(cargo clippy:*)",
@@ -56,11 +56,21 @@
      "WebFetch(domain:portkey.ai)",
      "WebFetch(domain:www.shuttle.dev)",
      "WebSearch",
-      "mcp__story-kit__*",
+      "mcp__storkit__*",
      "Edit",
      "Write",
      "Bash(find *)",
-      "Bash(sqlite3 *)"
+      "Bash(sqlite3 *)",
      "Bash(cat <<:*)",
      "Bash(cat <<'ENDJSON:*)",
      "Bash(make release:*)",
      "Bash(npm test:*)",
      "Bash(head *)",
      "Bash(tail *)",
      "Bash(wc *)",
      "Bash(npx vite:*)",
      "Bash(npm run dev:*)",
      "Bash(stat *)"
    ]
  }
 }
@@ -0,0 +1,11 @@
 # Docker build context exclusions
 **/target/
 **/node_modules/
 frontend/dist/
 .storkit/worktrees/
 .storkit/logs/
 .storkit/work/6_archived/
 .git/
 *.swp
 *.swo
 .DS_Store
@@ -5,9 +5,10 @@
 # Local environment (secrets)
 .env
-# App specific (root-level; story-kit subdirectory patterns live in .story_kit/.gitignore)
+# App specific (root-level; storkit subdirectory patterns live in .storkit/.gitignore)
 store.json
-.story_kit_port
+.storkit_port
 .storkit/bot.toml.bak
 # Rust stuff
 target
@@ -3,6 +3,6 @@ frontend/
 node_modules/
 .claude/
 .git/
-.story_kit/
+.storkit/
 store.json
-.story_kit_port
+.storkit_port
@@ -1,6 +1,6 @@
 {
  "mcpServers": {
-    "story-kit": {
+    "storkit": {
      "type": "http",
      "url": "http://localhost:3001/mcp"
    }
@@ -17,3 +17,9 @@ work/4_merge/
 # Coverage reports (generated by cargo-llvm-cov, not tracked in git)
 coverage/
 # Token usage log (generated at runtime, contains cost data)
 token_usage.jsonl
 # Chat service logs
 whatsapp_history.json
@@ -9,16 +9,22 @@
 When you start a new session with this project:
-1. **Check for MCP Tools:** Read `.mcp.json` to discover the MCP server endpoint. Then list available tools by calling:
+1. **Check Setup Wizard:** Call `wizard_status` to check if project setup is complete. If the wizard is not complete, guide the user through the remaining steps. Important rules for the wizard flow:
   - **Be conversational.** Don't show tool names, step numbers, or raw wizard output to the user.
   - **On projects with existing code:** Read the codebase and generate each file, then show the user what you wrote and ask if it looks right.
   - **On bare projects with no code:** Ask the user what they want to build, what language/framework they plan to use, and generate files from their answers.
   - **You must actually generate the files.** The workflow for each step is: (1) call `wizard_generate` with no args to get a hint, (2) write the file content yourself based on the conversation, (3) call `wizard_generate` again with the `content` argument containing the full file body, (4) show the user what you wrote, (5) call `wizard_confirm` (they approve), `wizard_retry` (they want changes), or `wizard_skip` (they want to skip). Do not stop after discussing — follow through and write the files.
   - **Keep moving.** After each step is confirmed, immediately proceed to the next wizard step without waiting for the user to ask.
 2. **Check for MCP Tools:** Read `.mcp.json` to discover the MCP server endpoint. Then list available tools by calling:
   ```bash
-   curl -s "$(jq -r '.mcpServers["story-kit"].url' .mcp.json)" \
+   curl -s "$(jq -r '.mcpServers["storkit"].url' .mcp.json)" \
     -H 'Content-Type: application/json' \
     -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
   ```
   This returns the full tool catalog (create stories, spawn agents, record tests, manage worktrees, etc.). Familiarize yourself with the available tools before proceeding. These tools allow you to directly manipulate the workflow and spawn subsidiary agents without manual file manipulation.
-2. **Read Context:** Check `.story_kit/specs/00_CONTEXT.md` for high-level project goals.
+3. **Read Context:** Check `.storkit/specs/00_CONTEXT.md` for high-level project goals.
-3. **Read Stack:** Check `.story_kit/specs/tech/STACK.md` for technical constraints and patterns.
+4. **Read Stack:** Check `.storkit/specs/tech/STACK.md` for technical constraints and patterns.
-4. **Check Work Items:** Look at `.story_kit/work/1_upcoming/` and `.story_kit/work/2_current/` to see what work is pending.
+5. **Check Work Items:** Look at `.storkit/work/1_backlog/` and `.storkit/work/2_current/` to see what work is pending.
 ---
@@ -52,7 +58,7 @@ project_root/
  ├── README.md          # This document
  ├── project.toml       # Agent configuration (roles, models, prompts)
  ├── work/              # Unified work item pipeline (stories, bugs, spikes)
-  │   ├── 1_upcoming/    # New work items awaiting implementation
+  │   ├── 1_backlog/    # New work items awaiting implementation
  │   ├── 2_current/     # Work in progress
  │   ├── 3_qa/          # QA review
  │   ├── 4_merge/       # Ready to merge to master
@@ -78,7 +84,7 @@ All work items (stories, bugs, spikes) live in the same `work/` pipeline. Items
 Items move through stages by moving the file between directories:
-`1_upcoming` → `2_current` → `3_qa` → `4_merge` → `5_done` → `6_archived`
+`1_backlog` → `2_current` → `3_qa` → `4_merge` → `5_done` → `6_archived`
 Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
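The stage order and the four-hour sweep described in this hunk can be sketched as a small state machine. This is an illustrative sketch only: the `Stage` enum, `next`, `dir_name`, and `should_sweep` are hypothetical names, not taken from the storkit server, and treating "after 4 hours" as "age of at least 4 hours" is an assumption.

```rust
use std::time::Duration;

/// Pipeline stages in the order the README lists them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    Backlog,
    Current,
    Qa,
    Merge,
    Done,
    Archived,
}

impl Stage {
    /// Directory name under `work/` for this stage.
    fn dir_name(self) -> &'static str {
        match self {
            Stage::Backlog => "1_backlog",
            Stage::Current => "2_current",
            Stage::Qa => "3_qa",
            Stage::Merge => "4_merge",
            Stage::Done => "5_done",
            Stage::Archived => "6_archived",
        }
    }

    /// The stage an item moves to next, or `None` once it is archived.
    fn next(self) -> Option<Stage> {
        match self {
            Stage::Backlog => Some(Stage::Current),
            Stage::Current => Some(Stage::Qa),
            Stage::Qa => Some(Stage::Merge),
            Stage::Merge => Some(Stage::Done),
            Stage::Done => Some(Stage::Archived),
            Stage::Archived => None,
        }
    }
}

/// Sweep threshold stated in the README: 4 hours.
const SWEEP_AFTER: Duration = Duration::from_secs(4 * 60 * 60);

/// Hypothetical helper: only items in `5_done` that are old enough are swept.
/// Assumption: an item exactly 4 hours old already qualifies.
fn should_sweep(stage: Stage, age: Duration) -> bool {
    stage == Stage::Done && age >= SWEEP_AFTER
}
```

Modelling the stages as an enum keeps the directory names and the ordering in one place, so a rename such as `1_upcoming` to `1_backlog` touches a single match arm.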
@@ -87,7 +93,7 @@ Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
 The server watches `.story_kit/work/` for changes. When a file is created, moved, or modified, the watcher auto-commits with a deterministic message and broadcasts a WebSocket notification to the frontend. This means:
 *   MCP tools only need to write/move files — the watcher handles git commits
-*   IDE drag-and-drop works (drag a story from `1_upcoming/` to `2_current/`)
+*   IDE drag-and-drop works (drag a story from `1_backlog/` to `2_current/`)
 *   The frontend updates automatically without manual refresh
 ---
@@ -156,7 +162,7 @@ Not everything needs to be a full story. Simple bugs can skip the story process:
 *   Performance issues with known fixes
 ### Bug Process
-1.  **Document Bug:** Create a bug file in `work/1_upcoming/` named `{id}_bug_{slug}.md` with:
+1.  **Document Bug:** Create a bug file in `work/1_backlog/` named `{id}_bug_{slug}.md` with:
    *   **Symptom:** What the user observes
    *   **Root Cause:** Technical explanation (if known)
    *   **Reproduction Steps:** How to trigger the bug
@@ -186,7 +192,7 @@ Not everything needs a story or bug fix. Spikes are time-boxed investigations to
 *   Need to validate performance constraints
 ### Spike Process
-1.  **Document Spike:** Create a spike file in `work/1_upcoming/` named `{id}_spike_{slug}.md` with:
+1.  **Document Spike:** Create a spike file in `work/1_backlog/` named `{id}_spike_{slug}.md` with:
    *   **Question:** What you need to answer
    *   **Hypothesis:** What you expect to be true
    *   **Timebox:** Strict limit for the research
@@ -209,7 +215,7 @@ When the LLM context window fills up (or the chat gets slow/confused):
 1.  **Stop Coding.**
 2.  **Instruction:** Tell the user to open a new chat.
 3.  **Handoff:** The only context the new LLM needs is in the `specs/` folder and `.mcp.json`.
-    *   *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_upcoming/` and `work/2_current/` to see what is pending."
+    *   *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_backlog/` and `work/2_current/` to see what is pending."
 ---
@@ -221,14 +227,36 @@ If a user hands you this document and says "Apply this process to my project":
 1.  **Check for MCP Tools:** Look for `.mcp.json` in the project root. If it exists, you have programmatic access to workflow tools and agent spawning capabilities.
 2.  **Analyze the Request:** Ask for the high-level goal ("What are we building?") and the tech preferences ("Rust or Python?").
 3.  **Git Check:** Check if the directory is a git repository (`git status`). If not, run `git init`.
-4.  **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_upcoming/` through `work/6_archived/`).
+4.  **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_backlog/` through `work/6_archived/`).
 5.  **Draft Context:** Write `specs/00_CONTEXT.md` based on the user's answer.
 6.  **Draft Stack:** Write `specs/tech/STACK.md` based on best practices for that language.
 7.  **Wait:** Ask the user for "Story #1".
 ---
-## 6. Code Quality
+## 6. Chat Bot Configuration
 Story Kit includes a chat bot that can be connected to one messaging platform at a time. The bot handles commands, LLM conversations, and pipeline notifications.
 **Only one transport can be active at a time.** To configure the bot, copy the appropriate example file to `.storkit/bot.toml`:
 | Transport | Example file | Webhook endpoint |
 |-----------|-------------|-----------------|
 | Matrix | `bot.toml.matrix.example` | *(uses Matrix sync, no webhook)* |
 | WhatsApp (Meta Cloud API) | `bot.toml.whatsapp-meta.example` | `/webhook/whatsapp` |
 | WhatsApp (Twilio) | `bot.toml.whatsapp-twilio.example` | `/webhook/whatsapp` |
 | Slack | `bot.toml.slack.example` | `/webhook/slack` |
 ```bash
 cp .storkit/bot.toml.matrix.example .storkit/bot.toml
 # Edit bot.toml with your credentials
 ```
 The `bot.toml` file is gitignored (it contains secrets). The example files are checked in for reference.
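The transport table above can be read as a lookup from transport to example file and webhook path. A minimal sketch of that lookup, assuming a hypothetical `Transport` enum; the type and method names are illustrative and are not taken from the storkit source, while the file names and endpoints are copied from the table.

```rust
/// The four transports listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Transport {
    Matrix,
    WhatsAppMeta,
    WhatsAppTwilio,
    Slack,
}

impl Transport {
    /// Example file to copy to `.storkit/bot.toml`.
    fn example_file(self) -> &'static str {
        match self {
            Transport::Matrix => "bot.toml.matrix.example",
            Transport::WhatsAppMeta => "bot.toml.whatsapp-meta.example",
            Transport::WhatsAppTwilio => "bot.toml.whatsapp-twilio.example",
            Transport::Slack => "bot.toml.slack.example",
        }
    }

    /// Webhook path served for this transport.
    /// Matrix uses sync rather than a webhook, so it has none.
    fn webhook_endpoint(self) -> Option<&'static str> {
        match self {
            Transport::Matrix => None,
            Transport::WhatsAppMeta | Transport::WhatsAppTwilio => Some("/webhook/whatsapp"),
            Transport::Slack => Some("/webhook/slack"),
        }
    }
}
```

Both WhatsApp providers share one endpoint, which is consistent with the rule that only one transport is active at a time.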
 ---
 ## 7. Code Quality
 **MANDATORY:** Before completing Step 3 (Verification) of any story, you MUST run all applicable linters, formatters, and test suites and fix ALL errors and warnings. Zero tolerance for warnings or errors.
@@ -0,0 +1,26 @@
 # Matrix Transport
 # Copy this file to bot.toml and fill in your values.
 # Only one transport can be active at a time.
 enabled = true
 transport = "matrix"
 homeserver = "https://matrix.example.com"
 username = "@botname:example.com"
 password = "your-bot-password"
 # List one or more rooms to listen in.
 room_ids = ["!roomid:example.com"]
 # Users allowed to interact with the bot (fail-closed: empty = nobody).
 allowed_users = ["@youruser:example.com"]
 # Bot display name in chat.
 # display_name = "Assistant"
 # Maximum conversation turns to remember per room (default: 20).
 # history_size = 20
 # Rooms where the bot responds to all messages (not just addressed ones).
 # This list is updated automatically when users toggle ambient mode at runtime.
 # ambient_rooms = ["!roomid:example.com"]
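The `allowed_users` comment in this example describes a fail-closed allow list: an empty list admits nobody. A minimal sketch of that rule, assuming a hypothetical helper `is_allowed_user`; the real check in the bot may be structured differently.

```rust
/// Fail-closed allow list: a sender is admitted only if it is listed.
/// An empty list therefore admits nobody.
/// (Hypothetical helper, not taken from the storkit source.)
fn is_allowed_user(allowed_users: &[&str], sender: &str) -> bool {
    allowed_users.iter().any(|user| *user == sender)
}
```

With this shape there is no special case for the empty list; rejecting everyone falls out of `any` returning `false` on an empty iterator.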
@@ -0,0 +1,23 @@
 # Slack Transport
 # Copy this file to bot.toml and fill in your values.
 # Only one transport can be active at a time.
 #
 # Setup:
 #   1. Create a Slack App at api.slack.com/apps
 #   2. Add OAuth scopes: chat:write, chat:update
 #   3. Subscribe to bot events: message.channels, message.groups, message.im
 #   4. Install the app to your workspace
 #   5. Set your webhook URL in Event Subscriptions: https://your-server/webhook/slack
 enabled = true
 transport = "slack"
 slack_bot_token = "xoxb-..."
 slack_signing_secret = "your-signing-secret"
 slack_channel_ids = ["C01ABCDEF"]
 # Bot display name (used in formatted messages).
 # display_name = "Assistant"
 # Maximum conversation turns to remember per channel (default: 20).
 # history_size = 20
@@ -0,0 +1,33 @@
 # WhatsApp Transport (Meta Cloud API)
 # Copy this file to bot.toml and fill in your values.
 # Only one transport can be active at a time.
 #
 # Setup:
 #   1. Create a Meta Business App at developers.facebook.com
 #   2. Add the WhatsApp product
 #   3. Copy your Phone Number ID and generate a permanent access token
 #   4. Register your webhook URL: https://your-server/webhook/whatsapp
 #   5. Set the verify token below to match what you configure in Meta's dashboard
 enabled = true
 transport = "whatsapp"
 whatsapp_provider = "meta"
 whatsapp_phone_number_id = "123456789012345"
 whatsapp_access_token = "EAAx..."
 whatsapp_verify_token = "my-secret-verify-token"
 # Optional: name of the approved Meta message template used for notifications
 # sent outside the 24-hour messaging window (default: "pipeline_notification").
 # whatsapp_notification_template = "pipeline_notification"
 # Bot display name (used in formatted messages).
 # display_name = "Assistant"
 # Maximum conversation turns to remember per user (default: 20).
 # history_size = 20
 # Optional: restrict which phone numbers can interact with the bot.
 # When set, only listed numbers are processed; all others are silently ignored.
 # When absent or empty, all numbers are allowed (open by default).
 # whatsapp_allowed_phones = ["+15551234567", "+15559876543"]
@@ -0,0 +1,29 @@
 # WhatsApp Transport (Twilio)
 # Copy this file to bot.toml and fill in your values.
 # Only one transport can be active at a time.
 #
 # Setup:
 #   1. Sign up at twilio.com
 #   2. Activate the WhatsApp sandbox (Messaging > Try it out > Send a WhatsApp message)
 #   3. Send the sandbox join code from your WhatsApp to the sandbox number
 #   4. Copy your Account SID, Auth Token, and sandbox number below
 #   5. Set your webhook URL in the Twilio console: https://your-server/webhook/whatsapp
 enabled = true
 transport = "whatsapp"
 whatsapp_provider = "twilio"
 twilio_account_sid = "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
 twilio_auth_token = "your_auth_token"
 twilio_whatsapp_number = "+14155238886"
 # Bot display name (used in formatted messages).
 # display_name = "Assistant"
 # Maximum conversation turns to remember per user (default: 20).
 # history_size = 20
 # Optional: restrict which phone numbers can interact with the bot.
 # When set, only listed numbers are processed; all others are silently ignored.
 # When absent or empty, all numbers are allowed (open by default).
 # whatsapp_allowed_phones = ["+15551234567", "+15559876543"]
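Unlike the Matrix `allowed_users` setting, the comments on `whatsapp_allowed_phones` describe an open-by-default list: when the key is absent or empty, every number is processed. A minimal sketch of that rule, assuming a hypothetical helper `is_allowed_phone` and assuming an absent key is represented as `None`.

```rust
/// Open-by-default allow list: `None` (key absent) or an empty list admits
/// every number; otherwise only listed numbers are admitted.
/// (Hypothetical helper, not taken from the storkit source.)
fn is_allowed_phone(allowed_phones: Option<&[&str]>, sender: &str) -> bool {
    match allowed_phones {
        None => true,
        Some(list) if list.is_empty() => true,
        Some(list) => list.contains(&sender),
    }
}
```

The two defaults are opposite, so a configuration copied from one transport to another does not carry the same access policy.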
@@ -0,0 +1,28 @@
 # Problems
 Recurring issues observed during pipeline operation. Review periodically and create stories for systemic problems.
 ## 2026-03-18: Stories graduating to "done" with empty merges (7 of 10)
 Pipeline allows stories to move through coding → QA → merge → done without any actual code changes landing on master. The squash-merge produces an empty diff but the pipeline still marks the story as done. Affected stories: 247, 273, 274, 278, 279, 280, 92. Only 266, 271, 277, and 281 actually shipped code. Root cause: no check that the merge commit contains a non-empty diff. Filed bug 283 for the manual_qa gate issue specifically, but the empty-merge-to-done problem is broader and needs its own fix.
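One way the missing non-empty-diff check could look is sketched below. It is a sketch under stated assumptions, not the fix that was shipped: the function names are hypothetical, and it assumes the server can shell out to `git diff --numstat`, which prints one line per changed file and nothing when the two revisions are identical.

```rust
use std::io;
use std::process::Command;

/// Returns true when `git diff --numstat` output lists no changed files.
fn numstat_is_empty(numstat: &str) -> bool {
    numstat.lines().all(|line| line.trim().is_empty())
}

/// Hypothetical gate: compares `base` and `head` and reports whether the
/// merge would land no changes. A failing `git` invocation is surfaced as an
/// error rather than being treated as "non-empty".
fn merge_is_empty(base: &str, head: &str) -> io::Result<bool> {
    let output = Command::new("git")
        .args(["diff", "--numstat", base, head])
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "git diff failed"));
    }
    Ok(numstat_is_empty(&String::from_utf8_lossy(&output.stdout)))
}
```

A pipeline step could refuse to move a story to `5_done` when `merge_is_empty` returns `Ok(true)`.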
 ## 2026-03-18: Agent committed directly to master instead of worktree
 Multiple agents have committed directly to master instead of their worktree/feature branch:
 - Commit `5f4591f` ("fix: update should_commit_stage test to match 5_done") — likely mergemaster
 - Commit `a32cfbd` ("Add bot-level command registry with help command") — story 285 coder committed code + Cargo.lock directly to master
 Agents should only commit to their feature branch or merge-queue branch, never to master directly. Suspect agents are running `git commit` in the project root instead of the worktree directory. This can also revert uncommitted fixes on master (e.g. project.toml pkill fix was overwritten). Frequency: at least 2 confirmed cases. This is a recurring and serious problem — needs a guard in the server or agent prompts.
 ## 2026-03-19: Auto-assign re-assigns mergemaster to failed merge stories in a loop
 After bug 295 fix (`auto_assign_available_work` after every pipeline advance), mergemaster gets re-assigned to stories that already have a merge failure flag. Story 310 had an empty diff merge failure — mergemaster correctly reported the failure, but auto-assign immediately re-assigned mergemaster to the same story, creating an infinite retry loop. The auto-assign logic needs to check for the `merge_failure` front matter flag before re-assigning agents to stories in `4_merge/`.
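The guard described here can be sketched as a front matter lookup performed before assignment. This is a sketch only: it assumes the story file opens with a front matter block delimited by `---` lines and that the flag appears as a `merge_failure:` key, and both function names are hypothetical.

```rust
/// Returns true if the story's front matter contains a `merge_failure` key.
/// Assumption: front matter is the block between a leading `---` line and
/// the next `---` line; any value for the key counts as a failure flag.
fn has_merge_failure(story: &str) -> bool {
    let mut lines = story.lines();
    if lines.next().map(str::trim) != Some("---") {
        return false; // no front matter block at all
    }
    lines
        .take_while(|line| line.trim() != "---")
        .any(|line| line.trim_start().starts_with("merge_failure:"))
}

/// Hypothetical eligibility check for re-assigning mergemaster:
/// the story must sit in `4_merge` and must not carry the failure flag.
fn eligible_for_mergemaster(stage_dir: &str, story: &str) -> bool {
    stage_dir == "4_merge" && !has_merge_failure(story)
}
```

Restricting the search to the front matter block avoids false positives when the story body merely mentions `merge_failure:`.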
 ## 2026-03-19: Coder produces no code (complete ghost — story 310)
 Story 310 (Bot delete command) went through the full pipeline — coder session ran, passed QA/gates, moved to merge — but the coder produced zero code. No commits on the feature branch, no commits on master. The entire agent session was a no-op. This is different from the "committed to master instead of worktree" problem — in this case, the coder simply did nothing. Need to investigate the coder logs to understand what happened. The empty-diff merge check would catch this at merge time, but ideally the server should detect "coder finished with no commits on feature branch" at the gate-check stage and fail early.
 ## 2026-03-19: Auto-assign assigns mergemaster to coding-stage stories
 Auto-assign picked mergemaster for story 310 which was in `2_current/`. Mergemaster should only work on stories in `4_merge/`. The `auto_assign_available_work` function doesn't enforce that the agent's configured stage matches the pipeline stage of the story it's being assigned to. Story 279 (auto-assign respects agent stage from front matter) was supposed to fix this, but the check may only apply to front-matter preferences, not the fallback assignment path.
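The stage check this entry calls for can be sketched as a mapping from an agent's configured stage to the one pipeline directory it may work in. The `"coder"` and `"qa"` values appear in `project.toml`; the `"mergemaster"` stage string and both function names are assumptions made for illustration.

```rust
/// Maps an agent's configured `stage` to the pipeline directory it may work in.
/// Stages without a pipeline directory (for example "other") map to `None`.
fn stage_dir_for_agent(agent_stage: &str) -> Option<&'static str> {
    match agent_stage {
        "coder" => Some("2_current"),
        "qa" => Some("3_qa"),
        "mergemaster" => Some("4_merge"), // assumed stage name
        _ => None,
    }
}

/// Hypothetical guard applied on every assignment path, including the fallback.
fn can_assign(agent_stage: &str, story_dir: &str) -> bool {
    stage_dir_for_agent(agent_stage) == Some(story_dir)
}
```

Calling one guard from both the front matter preference path and the fallback path would close the gap this entry suspects.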
@@ -1,7 +1,27 @@
 # Project-wide default QA mode: "server", "agent", or "human".
 # Per-story `qa` front matter overrides this setting.
 default_qa = "server"
 # Default model for coder agents. Only agents with this model are auto-assigned.
 # Opus coders are reserved for explicit per-story `agent:` front matter requests.
 default_coder_model = "sonnet"
 # Maximum concurrent coder agents. Stories wait in 2_current/ when all slots are full.
 max_coders = 3
 # Maximum retries per story per pipeline stage before marking as blocked.
 # Set to 0 to disable retry limits.
 max_retries = 3
 # Base branch name for this project. Worktree creation, merges, and agent prompts
 # use this value for {{base_branch}}. When not set, falls back to auto-detection
 # (reads current HEAD branch).
 base_branch = "master"
 [[component]]
 name = "frontend"
 path = "frontend"
-setup = ["npm install", "npm run build"]
+setup = ["npm ci", "npm run build"]
 teardown = []
 [[component]]
@@ -10,45 +30,6 @@ path = "."
 setup = ["mkdir -p frontend/dist", "cargo check"]
 teardown = []
 [[agent]]
 name = "supervisor"
 stage = "other"
 role = "Coordinates work, reviews PRs, decomposes stories."
 model = "opus"
 max_turns = 200
 max_budget_usd = 15.00
 prompt = """You are the supervisor for story {{story_id}}. Your job is to coordinate coder agents to implement this story.
 Read CLAUDE.md first, then .story_kit/README.md to understand the dev process (SDTW). You are responsible for ensuring coders follow this process.
 ## Your MCP Tools
 You have these tools via the story-kit MCP server:
 - start_agent(story_id, agent_name) - Start a coder agent on a story
 - wait_for_agent(story_id, agent_name, timeout_ms) - Block until the agent reaches a terminal state (completed/failed). Returns final status including completion report with gates_passed.
 - get_agent_output(story_id, agent_name, timeout_ms) - Poll agent output (returns recent events, call repeatedly)
 - list_agents() - See all running agents and their status
 - stop_agent(story_id, agent_name) - Stop a running agent
 - get_story_todos(story_id) - Get unchecked acceptance criteria for a story in work/2_current/
 - ensure_acceptance(story_id) - Check if a story passes acceptance gates
 ## Your Workflow
 1. Read CLAUDE.md and .story_kit/README.md to understand the project and dev process
 2. Read the story file from .story_kit/work/ to understand requirements
 3. Move it to work/2_current/ if it is in work/1_upcoming/
 4. Start coder-1 on the story: call start_agent with story_id="{{story_id}}" and agent_name="coder-1"
 5. Wait for completion: call wait_for_agent with story_id="{{story_id}}" and agent_name="coder-1". The server automatically runs acceptance gates (cargo clippy + tests) when the coder process exits. wait_for_agent returns when the coder reaches a terminal state.
 6. Check the result: inspect the "completion" field in the wait_for_agent response — if gates_passed is true, the work is done; if false, review the gate_output and decide whether to start a fresh coder.
 7. If the agent gets stuck, stop it and start a fresh agent.
 8. STOP here. Do NOT accept the story or merge to master. Report the status to the human for final review and acceptance.
 ## Rules
 - Do NOT implement code yourself - delegate to coder agents
 - Only run one coder at a time per story
 - Focus on coordination, monitoring, and quality review
 - Never accept stories or merge to master - that is the human's job
 - Your job ends when the coder's completion report shows gates_passed=true and you have reported the result"""
 system_prompt = "You are a supervisor agent. Read CLAUDE.md and .story_kit/README.md first to understand the project dev process. Use MCP tools to coordinate sub-agents. Never implement code directly - always delegate to coder agents and monitor their progress. Use wait_for_agent to block until the coder finishes — the server automatically runs acceptance gates when the agent process exits. Never accept stories or merge to master - get all gates green and report to the human."
 [[agent]]
 name = "coder-1"
 stage = "coder"
@@ -57,7 +38,7 @@ model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
 prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
-system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
+system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
 [[agent]]
 name = "coder-2"
@@ -67,45 +48,77 @@ model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
 prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
-system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
+system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
 [[agent]]
 name = "coder-3"
 stage = "coder"
 role = "Full-stack engineer. Implements features across all components."
 model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
 prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
 system_prompt = "You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
 [[agent]]
 name = "qa-2"
 stage = "qa"
-role = "Reviews coder work in worktrees: runs quality gates, generates testing plans, and reports findings."
+role = "Reviews coder work in worktrees: runs quality gates, verifies acceptance criteria, and reports findings."
 model = "sonnet"
 max_turns = 40
 max_budget_usd = 4.00
-prompt = """You are the QA agent for story {{story_id}}. Your job is to review the coder's work in the worktree and produce a structured QA report.
+prompt = """You are the QA agent for story {{story_id}}. Your job is to verify the coder's work satisfies the story's acceptance criteria and produce a structured QA report.
 Read CLAUDE.md first, then .story_kit/README.md to understand the dev process.
 ## Your Workflow
-### 1. Code Quality Scan
+### 0. Read the Story
-- Run `git diff master...HEAD --stat` to see what files changed
+- Read the story file at `.storkit/work/3_qa/{{story_id}}.md`
-- Run `git diff master...HEAD` to review the actual changes for obvious coding mistakes (unused imports, dead code, unhandled errors, hardcoded values)
+- Extract every acceptance criterion (the `- [ ]` checkbox lines)
-- Run `cargo clippy --all-targets --all-features` and note any warnings
+- Keep this list in mind for Step 3
 ### 1. Deterministic Gates (Prerequisites)
 Run these first — if any fail, reject immediately without proceeding to AC review:
 - Run `cargo clippy --all-targets --all-features` — must show 0 errors, 0 warnings
 - Run `cargo test` and verify all tests pass
 - If a `frontend/` directory exists:
  - Run `npm run build` and note any TypeScript errors
  - Run `npx @biomejs/biome check src/` and note any linting issues
  - Run `npm test` and verify all frontend tests pass
-### 2. Test Verification
+### 2. Code Change Review
- Run `cargo test` and verify all tests pass
+- Run `git diff master...HEAD --stat` to see what files changed
- If `frontend/` exists: run `npm test` and verify all frontend tests pass
+- Run `git diff master...HEAD` to review the actual changes
- Review test quality: look for tests that are trivial or don't assert meaningful behavior
+- Flag any incomplete implementations:
  - `todo!()`, `unimplemented!()`, `panic!()` used as stubs
  - Placeholder strings like "TODO", "FIXME", "not implemented"
  - Empty match arms or arms that just return `Default::default()`
  - Hardcoded values where real logic is expected
 - Note any obvious coding mistakes (unused imports, dead code, unhandled errors)
-### 3. Manual Testing Support
+### 3. Acceptance Criteria Review
 For each AC extracted in Step 0:
 - Review the diff and test files to determine if the code addresses this AC
 - PASS: describe specifically how the code addresses it (which file/function/test)
 - FAIL: explain exactly what is missing or incorrect
 An AC fails if:
 - No code change or test relates to it
 - The implementation is stubbed out (todo!/unimplemented!)
 - A test exists but doesn't actually assert the behaviour described
 ### 4. Manual Testing Support (only if all gates PASS and all ACs PASS)
 - Build the server: run `cargo build` and note success/failure
 - If build succeeds: find a free port (try 3010-3020) and attempt to start the server
 - Generate a testing plan including:
  - URL to visit in the browser
  - Things to check in the UI
  - curl commands to exercise relevant API endpoints
- Kill the test server when done: `pkill -f story-kit || true`
+- Kill the test server when done: `pkill -f 'target.*storkit' || true` (NEVER use `pkill -f storkit` — it kills the vite dev server)
-### 4. Produce Structured Report
+### 5. Produce Structured Report and Verdict
-Print your QA report to stdout before your process exits. The server will automatically run acceptance gates. Use this format:
+Print your QA report to stdout. Then call `approve_qa` or `reject_qa` via the MCP tool based on the overall result. Use this format:
 ```
 ## QA Report for {{story_id}}
@@ -114,27 +127,38 @@ Print your QA report to stdout before your process exits. The server will automa
 - clippy: PASS/FAIL (details)
 - TypeScript build: PASS/FAIL/SKIP (details)
 - Biome lint: PASS/FAIL/SKIP (details)
 - Code review findings: (list any issues found, or "None")
 ### Test Verification
 - cargo test: PASS/FAIL (N tests)
 - npm test: PASS/FAIL/SKIP (N tests)
- Test quality issues: (list any trivial/weak tests, or "None")
+- Incomplete implementations: (list any todo!/unimplemented!/stubs found, or "None")
 - Other code review findings: (list any issues found, or "None")
 ### Acceptance Criteria Review
 - AC: <criterion text>
  Result: PASS/FAIL
  Evidence: <how the code addresses it, or what is missing>
 (repeat for each AC)
 ### Manual Testing Plan
- Server URL: http://localhost:PORT (or "Build failed")
+- Server URL: http://localhost:PORT (or "Skipped — gate/AC failure" or "Build failed")
- Pages to visit: (list)
+- Pages to visit: (list, or "N/A")
- Things to check: (list)
+- Things to check: (list, or "N/A")
- curl commands: (list)
+- curl commands: (list, or "N/A")
 ### Overall: PASS/FAIL
 Reason: (summary of why it passed or the primary reason it failed)
 ```
 After printing the report:
 - If Overall is PASS: call `approve_qa(story_id='{{story_id}}')` via MCP
 - If Overall is FAIL: call `reject_qa(story_id='{{story_id}}', notes='<concise reason>')` via MCP so the coder knows exactly what to fix
 ## Rules
 - Do NOT modify any code — read-only review only
- If the server fails to start, still provide the testing plan with curl commands
+- Gates must pass before AC review — a gate failure is an automatic reject
- The server automatically runs acceptance gates when your process exits"""
+- If any AC is not met, the overall result is FAIL
-system_prompt = "You are a QA agent. Your job is read-only: review code quality, run tests, try to start the server, and produce a structured QA report. Do not modify code. The server automatically runs acceptance gates when your process exits."
+- Always call approve_qa or reject_qa — never leave the story without a verdict"""
 system_prompt = "You are a QA agent. Your job is read-only: run quality gates, verify each acceptance criterion against the diff, and produce a structured QA report. Always call approve_qa or reject_qa via MCP to record your verdict. Do not modify code."
 [[agent]]
 name = "coder-opus"
@@ -144,45 +168,67 @@ model = "opus"
 max_turns = 80
 max_budget_usd = 20.00
 prompt = "You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
-system_prompt = "You are a senior full-stack engineer working autonomously in a git worktree. You handle complex tasks requiring deep architectural understanding. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
+system_prompt = "You are a senior full-stack engineer working autonomously in a git worktree. You handle complex tasks requiring deep architectural understanding. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
 [[agent]]
 name = "qa"
 stage = "qa"
-role = "Reviews coder work in worktrees: runs quality gates, generates testing plans, and reports findings."
+role = "Reviews coder work in worktrees: runs quality gates, verifies acceptance criteria, and reports findings."
 model = "sonnet"
 max_turns = 40
 max_budget_usd = 4.00
-prompt = """You are the QA agent for story {{story_id}}. Your job is to review the coder's work in the worktree and produce a structured QA report.
+prompt = """You are the QA agent for story {{story_id}}. Your job is to verify the coder's work satisfies the story's acceptance criteria and produce a structured QA report.
 Read CLAUDE.md first, then .story_kit/README.md to understand the dev process.
 ## Your Workflow
-### 1. Code Quality Scan
+### 0. Read the Story
- Run `git diff master...HEAD --stat` to see what files changed
+- Read the story file at `.storkit/work/3_qa/{{story_id}}.md`
- Run `git diff master...HEAD` to review the actual changes for obvious coding mistakes (unused imports, dead code, unhandled errors, hardcoded values)
+- Extract every acceptance criterion (the `- [ ]` checkbox lines)
- Run `cargo clippy --all-targets --all-features` and note any warnings
+- Keep this list in mind for Step 3
 ### 1. Deterministic Gates (Prerequisites)
 Run these first — if any fail, reject immediately without proceeding to AC review:
 - Run `cargo clippy --all-targets --all-features` — must show 0 errors, 0 warnings
 - Run `cargo test` and verify all tests pass
 - If a `frontend/` directory exists:
  - Run `npm run build` and note any TypeScript errors
  - Run `npx @biomejs/biome check src/` and note any linting issues
  - Run `npm test` and verify all frontend tests pass
-### 2. Test Verification
+### 2. Code Change Review
- Run `cargo test` and verify all tests pass
+- Run `git diff master...HEAD --stat` to see what files changed
- If `frontend/` exists: run `npm test` and verify all frontend tests pass
+- Run `git diff master...HEAD` to review the actual changes
- Review test quality: look for tests that are trivial or don't assert meaningful behavior
+- Flag any incomplete implementations:
  - `todo!()`, `unimplemented!()`, `panic!()` used as stubs
  - Placeholder strings like "TODO", "FIXME", "not implemented"
  - Empty match arms or arms that just return `Default::default()`
  - Hardcoded values where real logic is expected
 - Note any obvious coding mistakes (unused imports, dead code, unhandled errors)
-### 3. Manual Testing Support
+### 3. Acceptance Criteria Review
 For each AC extracted in Step 0:
 - Review the diff and test files to determine if the code addresses this AC
 - PASS: describe specifically how the code addresses it (which file/function/test)
 - FAIL: explain exactly what is missing or incorrect
 An AC fails if:
 - No code change or test relates to it
 - The implementation is stubbed out (todo!/unimplemented!)
 - A test exists but doesn't actually assert the behaviour described
 ### 4. Manual Testing Support (only if all gates PASS and all ACs PASS)
 - Build the server: run `cargo build` and note success/failure
 - If build succeeds: find a free port (try 3010-3020) and attempt to start the server
 - Generate a testing plan including:
  - URL to visit in the browser
  - Things to check in the UI
  - curl commands to exercise relevant API endpoints
- Kill the test server when done: `pkill -f story-kit || true`
+- Kill the test server when done: `pkill -f 'target.*storkit' || true` (NEVER use `pkill -f storkit` — it kills the vite dev server)
-### 4. Produce Structured Report
+### 5. Produce Structured Report and Verdict
-Print your QA report to stdout before your process exits. The server will automatically run acceptance gates. Use this format:
+Print your QA report to stdout. Then call `approve_qa` or `reject_qa` via the MCP tool based on the overall result. Use this format:
 ```
 ## QA Report for {{story_id}}
@@ -191,27 +237,38 @@ Print your QA report to stdout before your process exits. The server will automa
 - clippy: PASS/FAIL (details)
 - TypeScript build: PASS/FAIL/SKIP (details)
 - Biome lint: PASS/FAIL/SKIP (details)
 - Code review findings: (list any issues found, or "None")
 ### Test Verification
 - cargo test: PASS/FAIL (N tests)
 - npm test: PASS/FAIL/SKIP (N tests)
- Test quality issues: (list any trivial/weak tests, or "None")
+- Incomplete implementations: (list any todo!/unimplemented!/stubs found, or "None")
 - Other code review findings: (list any issues found, or "None")
 ### Acceptance Criteria Review
 - AC: <criterion text>
  Result: PASS/FAIL
  Evidence: <how the code addresses it, or what is missing>
 (repeat for each AC)
 ### Manual Testing Plan
- Server URL: http://localhost:PORT (or "Build failed")
+- Server URL: http://localhost:PORT (or "Skipped — gate/AC failure" or "Build failed")
- Pages to visit: (list)
+- Pages to visit: (list, or "N/A")
- Things to check: (list)
+- Things to check: (list, or "N/A")
- curl commands: (list)
+- curl commands: (list, or "N/A")
 ### Overall: PASS/FAIL
 Reason: (summary of why it passed or the primary reason it failed)
 ```
 After printing the report:
 - If Overall is PASS: call `approve_qa(story_id='{{story_id}}')` via MCP
 - If Overall is FAIL: call `reject_qa(story_id='{{story_id}}', notes='<concise reason>')` via MCP so the coder knows exactly what to fix
 ## Rules
 - Do NOT modify any code — read-only review only
- If the server fails to start, still provide the testing plan with curl commands
+- Gates must pass before AC review — a gate failure is an automatic reject
- The server automatically runs acceptance gates when your process exits"""
+- If any AC is not met, the overall result is FAIL
-system_prompt = "You are a QA agent. Your job is read-only: review code quality, run tests, try to start the server, and produce a structured QA report. Do not modify code. The server automatically runs acceptance gates when your process exits."
+- Always call approve_qa or reject_qa — never leave the story without a verdict"""
 system_prompt = "You are a QA agent. Your job is read-only: run quality gates, verify each acceptance criterion against the diff, and produce a structured QA report. Always call approve_qa or reject_qa via MCP to record your verdict. Do not modify code."
 [[agent]]
 name = "mergemaster"
@@ -0,0 +1,43 @@
 # Example project.toml — copy to .storkit/project.toml and customise.
 # This file is checked in; project.toml itself is gitignored (it may contain
 # instance-specific settings).
 # Project-wide default QA mode: "server", "agent", or "human".
 # Per-story `qa` front matter overrides this setting.
 default_qa = "server"
 # Default model for coder agents. Only agents with this model are auto-assigned.
 # Opus coders are reserved for explicit per-story `agent:` front matter requests.
 default_coder_model = "sonnet"
 # Maximum concurrent coder agents. Stories wait in 2_current/ when all slots are full.
 max_coders = 3
 # Maximum retries per story per pipeline stage before marking as blocked.
 # Set to 0 to disable retry limits.
 max_retries = 2
 # Base branch name for this project. Worktree creation, merges, and agent prompts
 # use this value for {{base_branch}}. When not set, falls back to auto-detection
 # (reads current HEAD branch).
 base_branch = "main"
 [[component]]
 name = "server"
 path = "."
 setup = ["cargo build"]
 teardown = []
 [[agent]]
 name = "coder-1"
 role = "Full-stack engineer"
 stage = "coder"
 model = "sonnet"
 max_turns = 50
 max_budget_usd = 5.00
 prompt = """
 You are working in a git worktree on story {{story_id}}.
 Read CLAUDE.md first, then .storkit/README.md to understand the dev process.
 Run: cd "{{worktree_path}}" && git difftool {{base_branch}}...HEAD
 Commit all your work before your process exits.
 """
@@ -0,0 +1,44 @@
 # Slack Integration Setup
 ## Bot Configuration
 Slack integration is configured via `bot.toml` in the project's `.storkit/` directory:
 ```toml
 transport = "slack"
 display_name = "Storkit"
 slack_bot_token = "xoxb-..."
 slack_signing_secret = "..."
 slack_channel_ids = ["C01ABCDEF"]
 ```
 ## Slack App Configuration
 ### Event Subscriptions
 1. In your Slack app settings, enable **Event Subscriptions**.
 2. Set the **Request URL** to: `https://<your-host>/webhook/slack`
 3. Subscribe to the `message.channels` and `message.im` bot events.
 ### Slash Commands
 Slash commands provide quick access to pipeline commands without mentioning the bot.
 1. In your Slack app settings, go to **Slash Commands**.
 2. Create the following commands, all pointing to the same **Request URL**: `https://<your-host>/webhook/slack/command`
 | Command | Description |
 |---------|-------------|
 | `/storkit-status` | Show pipeline status and agent availability |
 | `/storkit-cost` | Show token spend: 24h total, top stories, and breakdown |
 | `/storkit-show` | Display the full text of a work item (e.g. `/storkit-show 42`) |
 | `/storkit-git` | Show git status: branch, changes, ahead/behind |
 | `/storkit-htop` | Show system and agent process dashboard |
 All slash command responses are **ephemeral** — only the user who invoked the command sees the response.
 ### OAuth & Permissions
 Required bot token scopes:
 - `chat:write` — send messages
 - `commands` — handle slash commands
@@ -118,8 +118,8 @@ To support both Remote and Local models, the system implements a `ModelProvider`
 Multiple instances can run simultaneously in different worktrees. To avoid port conflicts:
- **Backend:** Set `STORYKIT_PORT` to a unique port (default is 3001). Example: `STORYKIT_PORT=3002 cargo run`
+- **Backend:** Set `STORKIT_PORT` to a unique port (default is 3001). Example: `STORKIT_PORT=3002 cargo run`
- **Frontend:** Run `npm run dev` from `frontend/`. It auto-selects the next unused port. It reads `STORYKIT_PORT` to know which backend to talk to, so export it before running: `export STORYKIT_PORT=3002 && cd frontend && npm run dev`
+- **Frontend:** Run `npm run dev` from `frontend/`. It auto-selects the next unused port. It reads `STORKIT_PORT` to know which backend to talk to, so export it before running: `export STORKIT_PORT=3002 && cd frontend && npm run dev`
 When running in a worktree, use a port that won't conflict with the main instance (3001). Ports 3002+ are good choices.
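
The port selection described above can be sketched as a small Rust helper. This is only an illustration: `resolve_port` is a hypothetical name, and falling back to the default when the variable is unset or unparsable is an assumption, not confirmed server behaviour. Only the variable name `STORKIT_PORT` and the default of 3001 come from the text.

```rust
/// Hypothetical helper: turns the raw value of `STORKIT_PORT` (if any) into a
/// port number. Falling back to 3001 on a missing or invalid value is an
/// assumption made for this sketch.
fn resolve_port(raw: Option<&str>) -> u16 {
    const DEFAULT_PORT: u16 = 3001;
    raw.and_then(|value| value.trim().parse::<u16>().ok())
        .unwrap_or(DEFAULT_PORT)
}
```

A caller would typically pass the environment value, for example `resolve_port(std::env::var("STORKIT_PORT").ok().as_deref())`.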
@@ -0,0 +1,24 @@
 ---
 name: "WhatsApp webhook HMAC signature verification"
 retry_count: 3
 blocked: true
 ---
 # Story 388: WhatsApp webhook HMAC signature verification
 ## User Story
 As a bot operator, I want incoming WhatsApp webhook requests to be cryptographically verified, so that forged requests from unauthorized sources are rejected.
 ## Acceptance Criteria
 - [ ] Meta webhooks: validate X-Hub-Signature-256 HMAC-SHA256 header using the app secret before processing
 - [ ] Twilio webhooks: validate request signature using the auth token before processing
 - [ ] Requests with missing or invalid signatures are rejected with 403 Forbidden
 - [ ] Verification is fail-closed: if signature checking is configured, unsigned requests are rejected
 - [ ] Existing bot.toml config is extended with any needed secrets (e.g. Meta app_secret for HMAC verification)
 - [ ] MUST use audited crypto crates (hmac, sha2, sha1, base64) — no hand-rolled cryptographic primitives
 ## Out of Scope
 - TBD
@@ -0,0 +1,40 @@
 ---
 name: "Fly.io Machines API integration for multi-tenant storkit SaaS"
 ---
 # Spike 408: Fly.io Machines API integration for multi-tenant storkit SaaS
 ## Question
 Can we build a working Rust integration that creates and manages per-tenant Fly.io Machines, attaches volumes, injects Claude credentials, and proxies JWT-authenticated HTTP/WebSocket traffic to the right machine?
 ## Hypothesis
 A thin Rust service using `reqwest` for the Machines API and `axum` for the reverse proxy is sufficient. No heavyweight orchestration framework needed.
 ## Prerequisites
 - Fly.io account with API token (set `FLY_API_TOKEN` env var)
 - Spike 407 findings reviewed
 ## Timebox
 4 hours
 ## Investigation Plan
 - [ ] Create a minimal Rust crate in `spikes/fly_machines/` — do not touch production code
 - [ ] Implement machine lifecycle: create, start, stop, destroy via Fly Machines REST API using `reqwest`
 - [ ] Test attaching a persistent volume to a machine and verify it persists across stop/start
 - [ ] Test secret injection — pass a dummy `credentials.json` as a Fly secret and verify it's readable inside the machine
 - [ ] Sketch the auth proxy: JWT validation → machine lookup → reverse proxy to machine's private IP; verify WebSocket proxying works
 - [ ] Measure actual cold start time for a minimal storkit container image
 - [ ] Document any API quirks, rate limits, or sharp edges discovered during testing
 ## Findings
 - TBD
 ## Recommendation
 - TBD
@@ -0,0 +1,22 @@
 ---
 name: "Multi-account OAuth token rotation on rate limit"
 ---
 # Story 411: Multi-account OAuth token rotation on rate limit
 ## User Story
 As a storkit user with multiple Claude Max subscriptions, I want the system to automatically rotate to a different account when one gets rate limited, so that agents and chat don't stall out waiting for limits to reset.
 ## Acceptance Criteria
 - [ ] OAuth login flow stores credentials per-account (keyed by email), not overwriting previous accounts
 - [ ] GET /oauth/status returns all stored accounts and their status (active, rate-limited, expired)
 - [ ] When the active account hits a rate limit, storkit automatically swaps to the next available account's refresh token, refreshes, and retries
 - [ ] The bot sends a notification in Matrix/WhatsApp when it swaps accounts
 - [ ] If all accounts are rate limited, the bot surfaces a clear message with the time until the earliest reset
 - [ ] A new /oauth/authorize login adds to the account pool rather than replacing the current credentials
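
One possible shape for the rotation decision in the criteria above is sketched below. Every name here (`AccountState`, `Rotation`, `pick_next_account`) is hypothetical, and the round-robin order is an assumption; the story does not specify how the next account is chosen.

```rust
/// Hypothetical per-account state, loosely following the statuses named in the story.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AccountState {
    Available,
    RateLimited { reset_in_secs: u64 },
    Expired,
}

/// Hypothetical outcome of a rotation attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rotation {
    /// Index of the account to switch to.
    SwitchTo(usize),
    /// No account is usable; carries the shortest wait among rate-limited accounts, if any.
    AllUnavailable { earliest_reset_secs: Option<u64> },
}

/// Walks the pool in round-robin order, starting just after `current`
/// (assumed ordering), and returns the first available account.
/// The caller is expected to have marked `current` as rate limited already.
fn pick_next_account(states: &[AccountState], current: usize) -> Rotation {
    let n = states.len();
    for offset in 1..=n {
        let idx = (current + offset) % n;
        if states[idx] == AccountState::Available {
            return Rotation::SwitchTo(idx);
        }
    }
    // Nothing usable: report the earliest reset so the bot can surface it.
    let earliest = states
        .iter()
        .filter_map(|state| match state {
            AccountState::RateLimited { reset_in_secs } => Some(*reset_in_secs),
            _ => None,
        })
        .min();
    Rotation::AllUnavailable { earliest_reset_secs: earliest }
}
```

Returning the earliest reset time alongside the failure case keeps the "all accounts are rate limited" message from needing a second pass over the pool.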
 ## Out of Scope
 - TBD
@@ -0,0 +1,24 @@
 ---
 name: "Recheck bot command to re-run gates without restarting agent"
 ---
 # Story 412: Recheck bot command to re-run gates without restarting agent
 ## User Story
 As a user, I want to send `recheck <number>` to the bot so that it re-runs acceptance gates on an existing worktree without spawning a new agent, so I can unblock stories that failed due to environment issues without wasting agent turns.
 ## Acceptance Criteria
 - [ ] recheck command is registered in chat/commands/mod.rs and appears in help output
 - [ ] `recheck <number>` runs run_acceptance_gates on the story's existing worktree
 - [ ] If gates pass, the story advances through the pipeline (same as if a coder completed successfully)
 - [ ] If gates fail, the error output is returned to the user (not silently retried)
 - [ ] If no worktree exists for the story, returns a clear error
 - [ ] Does not spawn a new agent or increment retry_count
 - [ ] Works from all transports (Matrix, WhatsApp, Slack)
 - [ ] Works from web UI slash commands
 ## Out of Scope
 - TBD
@@ -0,0 +1,21 @@
 ---
 name: "Unblock command handles all stuck states not just blocked flag"
 ---
 # Story 435: Unblock command handles all stuck states not just blocked flag
 ## User Story
 As a project owner, I want the unblock command to clear any stuck state on a story — not just the blocked flag — so that I have a single command to unstick stories regardless of why they're stuck.
 ## Acceptance Criteria
 - [ ] Unblock clears merge_failure field in addition to blocked flag
 - [ ] Unblock clears review_hold field
 - [ ] Unblock reports which fields were cleared in the confirmation message
 - [ ] Unblock works on stories in any pipeline stage (backlog, current, qa, merge, done)
 - [ ] If no stuck state is found (no blocked, merge_failure, or review_hold), returns a clear message saying so
 ## Out of Scope
 - TBD
@@ -0,0 +1,26 @@
 ---
 name: "Unify story stuck states into a single status field"
 ---
 # Refactor 436: Unify story stuck states into a single status field
 ## Current State
 - TBD
 ## Desired State
 Replace the separate blocked, merge_failure, and review_hold front matter fields with a single status field (e.g. status: blocked, status: merge_failure, status: review_hold). Simplifies the unblock command, auto-assign checks, and pipeline advance logic.
 ## Acceptance Criteria
 - [ ] Replace blocked: true, merge_failure: string, and review_hold: true with a single status: field in story front matter
 - [ ] Auto-assign checks a single field instead of three separate ones
 - [ ] Pipeline advance and lifecycle code reads/writes the unified status field
 - [ ] Unblock command clears the status field regardless of which stuck state it was
 - [ ] retry_count remains a separate field (it's a counter, not a state)
 - [ ] Migration: existing stories with old fields are handled gracefully on read
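
A minimal sketch of how the unified field and the read-time migration might look in Rust. The type and function names (`StuckStatus`, `LegacyFields`, `status_from_legacy`) are hypothetical, keeping the merge failure message inside the enum variant is an assumption, and so is the precedence used when more than one legacy field is set.

```rust
/// Hypothetical unified status replacing the three separate front matter fields.
#[derive(Debug, Clone, PartialEq, Eq)]
enum StuckStatus {
    Blocked,
    /// Assumption: the old `merge_failure` string is preserved as the payload.
    MergeFailure(String),
    ReviewHold,
}

/// Hypothetical view of the legacy front matter fields as read from an old story.
#[derive(Debug, Default)]
struct LegacyFields {
    blocked: bool,
    merge_failure: Option<String>,
    review_hold: bool,
}

/// Maps legacy fields onto the unified status. The precedence
/// (merge failure, then blocked, then review hold) is an assumption.
fn status_from_legacy(legacy: &LegacyFields) -> Option<StuckStatus> {
    if let Some(reason) = &legacy.merge_failure {
        return Some(StuckStatus::MergeFailure(reason.clone()));
    }
    if legacy.blocked {
        return Some(StuckStatus::Blocked);
    }
    if legacy.review_hold {
        return Some(StuckStatus::ReviewHold);
    }
    None
}
```

With a single `Option<StuckStatus>`, the unblock command only has to set one value to `None`, and auto-assign only has to check one field.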
 ## Out of Scope
 - TBD
@@ -0,0 +1,31 @@
 ---
 name: "Rename project from \"storkit\" to \"huskies\""
 ---
 # Story 455: Rename project from "storkit" to "huskies"
 ## User Story
 As a project maintainer, I want to rename the project from "storkit" to "huskies" so that the product has its new identity throughout the codebase, tooling, and documentation.
 ## Acceptance Criteria
 - [ ] Rust crate name in server/Cargo.toml changed from 'storkit' to 'huskies'
 - [ ] Binary name changed to 'huskies' (Dockerfile CMD, release script binary names)
 - [ ] Environment variables renamed: STORKIT_PORT → HUSKIES_PORT, STORKIT_HOST → HUSKIES_HOST
 - [ ] Docker service name, container_name, image name, and volume names updated in docker-compose.yml
 - [ ] Docker user/group renamed from 'storkit' to 'huskies' in Dockerfile (groupadd, useradd, home dir /home/huskies/.claude)
 - [ ] MCP server registration renamed from 'storkit' to 'huskies' in scaffold-generated .mcp.json and in server/src/http/mcp/mod.rs serverInfo name
 - [ ] All 35+ MCP tool permission patterns updated from mcp__storkit__* to mcp__huskies__* across code and permission configs
 - [ ] The .storkit/ project directory marker renamed to .huskies/ throughout all Rust source (paths.rs, config.rs, scaffold.rs, watcher.rs, prompts.rs, and all agent/pipeline code)
 - [ ] Release script updated: Gitea repo path dave/storkit → dave/huskies, changelog regex updated to match ^(huskies|storkit|story-kit): for backwards-compatible history parsing, binary artifact names updated
 - [ ] Git commit prefix convention updated from 'storkit:' to 'huskies:' in storkit README and agent prompts
 - [ ] Website updated: page title, headings, and contact email (hello@storkit.dev) if domain changes
 - [ ] README.md updated: all CLI examples use 'huskies' binary name, all .storkit/ references become .huskies/
 - [ ] A migration path exists for existing installs: either storkit auto-detects and migrates .storkit/ → .huskies/, or a migration script (script/migrate) is provided
 - [ ] All Claude Code .mcp.json files in existing worktrees are regenerated via scaffold or migration
 - [ ] Gitea repository renamed from dave/storkit to dave/huskies (external action required, noted in story)
 ## Out of Scope
 - TBD
@@ -0,0 +1,28 @@
 ---
 name: "strip_bot_mention fails on Element markdown mention pill format"
 ---
 # Bug 461: strip_bot_mention fails on Element markdown mention pill format
 ## Description
 When Element sends a message with a mention pill, the plain text body uses Markdown link format: `[@timmy:crashlabs.io](https://matrix.to/#/@timmy:crashlabs.io) status`. The `strip_bot_mention` function in chat/util.rs uses `strip_prefix_ci` which expects the message to start with `@timmy` or the display name. Since the message starts with `[`, all prefix checks fail, the mention is not stripped, and the entire Markdown link becomes the "command name". Deterministic commands like `status`, `help`, etc. are never matched — they fall through to the LLM instead. The `mentions_bot` function works correctly because it uses `contains()` rather than prefix matching, so the bot IS triggered, but the command text extraction is broken.
 ## How to Reproduce
 1. In Element, mention the bot using a mention pill: @botname status.
 2. Element sends plain body as `[@bot:server](https://matrix.to/#/@bot:server) status`.
 3. Observe that the bot routes to LLM instead of the deterministic status command handler.
 ## Actual Result
 strip_bot_mention returns the original text unchanged. The command name is parsed as the entire Markdown link. No deterministic command matches. Message falls through to LLM.
 ## Expected Result
 strip_bot_mention strips the Markdown mention pill `[...](https://matrix.to/...)` and returns `status`. The deterministic command handler matches and handles it.
 ## Acceptance Criteria
 - [ ] strip_bot_mention in chat/util.rs handles the Markdown mention pill format [display](https://matrix.to/#/@user:server)
 - [ ] Deterministic commands like 'status', 'help', 'overview' work when sent via Element mention pills
 - [ ] Existing plain-text mention formats (@bot:server command, @bot command, BotName command) continue to work
 - [ ] Tests added for Markdown mention pill format in util.rs
@@ -0,0 +1,34 @@
 ---
 name: "strip_bot_mention fails on Element Markdown mention pill format"
 ---
 # Bug 460: strip_bot_mention fails on Element Markdown mention pill format
 ## Description
 When Element sends a mention pill, the plain text `body` field contains a Markdown-style link like `[@timmy:crashlabs.io](https://matrix.to/#/@timmy:crashlabs.io) status`. The `strip_bot_mention` function uses prefix matching, so it tries to match `@timmy:crashlabs.io`, `@timmy`, and `Timmy` against text starting with `[` — none match. The entire message falls through to the LLM as a non-command.
 `mentions_bot` works because it uses `body.contains(full_id)` which finds the MXID embedded inside the Markdown link. But `strip_bot_mention` fails because the text starts with `[`, not `@` or the display name.
 This causes all deterministic bot commands (status, help, ambient, etc.) to be routed to the LLM instead of being handled by the bot when the user uses Element's mention pill (@-autocomplete).
 ## How to Reproduce
 1. In Element, type `@timmy` and use the autocomplete pill to mention the bot
 2. Append a command like `status`
 3. Send the message
 ## Actual Result
 The command falls through to the LLM. The bot logs show no "Handled bot command" entry. The plain body is `[@timmy:crashlabs.io](https://matrix.to/#/@timmy:crashlabs.io) status` which `strip_bot_mention` cannot parse.
 ## Expected Result
 The bot should strip the Markdown mention link wrapper, extract the MXID or display name, and match the command deterministically. `@timmy status` via mention pill should produce the same pipeline status output as typing `@timmy status` manually.
 ## Acceptance Criteria
 - [ ] strip_bot_mention handles Markdown link format `[display](https://matrix.to/#/@user:server) command` and extracts the command text
 - [ ] Deterministic commands (status, help, ambient, etc.) work when invoked via Element mention pill autocomplete
 - [ ] Unit tests cover the Markdown mention pill body format
 - [ ] Existing strip_bot_mention tests still pass (plain @mention and display name formats)
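
The pill handling described in this bug can be sketched as follows. This is not the actual `strip_bot_mention` code: `strip_markdown_mention_pill` and its `bot_user_id` parameter are hypothetical, and the sketch assumes the pill always links to `https://matrix.to/#/` followed by the MXID, as in the example body above.

```rust
/// Hypothetical helper: if `text` starts with a Markdown mention pill of the form
/// `[label](https://matrix.to/#/<mxid>)` that targets `bot_user_id`, returns the
/// text after the pill. Returns `None` for any other input, so a caller could
/// fall back to the existing plain-text prefix checks.
fn strip_markdown_mention_pill<'a>(text: &'a str, bot_user_id: &str) -> Option<&'a str> {
    let trimmed = text.trim_start();
    // The pill opens with '[' and the label is closed by "](".
    let rest = trimmed.strip_prefix('[')?;
    let label_end = rest.find("](")?;
    let after_label = &rest[label_end + 2..];
    // Only matrix.to links are treated as mention pills.
    let url_rest = after_label.strip_prefix("https://matrix.to/#/")?;
    let url_end = url_rest.find(')')?;
    // The link target must be the bot's own MXID.
    if &url_rest[..url_end] != bot_user_id {
        return None;
    }
    // Skip separators that commonly follow a mention (whitespace, ':' or ',').
    let remainder = url_rest[url_end + 1..]
        .trim_start_matches(|c: char| c.is_whitespace() || c == ':' || c == ',');
    Some(remainder)
}
```

For the body quoted above, calling the helper with `"@timmy:crashlabs.io"` as the bot ID yields `Some("status")`, which the deterministic command matcher could then consume.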
@@ -10,7 +10,7 @@ The `prompt_permission` MCP tool returns plain text ("Permission granted for '..
 ## How to Reproduce
-1. Start the story-kit server and open the web UI
+1. Start the storkit server and open the web UI
 2. Chat with the claude-code-pty model
 3. Ask it to do something that requires a tool NOT in `.claude/settings.json` allow list (e.g. `wc -l /etc/hosts`, or WebFetch to a non-allowed domain)
 4. The permission dialog appears — click Approve
@@ -6,7 +6,7 @@ name: "Retry limit for mergemaster and pipeline restarts"
 ## User Story
-As a developer using story-kit, I want pipeline auto-restarts to have a configurable retry limit so that failing agents don't loop infinitely consuming CPU and API credits.
+As a developer using storkit, I want pipeline auto-restarts to have a configurable retry limit so that failing agents don't loop infinitely consuming CPU and API credits.
 ## Acceptance Criteria
@@ -23,7 +23,7 @@ The watcher should periodically check `5_done/` and move items older than 4 hour
 - All MCP tools and pipeline logic that reference `5_archived` need updating to use `5_done`
 - Frontend pipeline display if it shows archived/done items
 - `.story_kit/README.md`: update pipeline stage documentation
- Story 116's init scaffolding: `story-kit init` must create `5_done/` and `6_archived/` directories
+- Story 116's init scaffolding: `storkit init` must create `5_done/` and `6_archived/` directories
 - Any templates or scaffold code that creates the `.story_kit/work/` directory structure
 ## Acceptance Criteria
@@ -35,7 +35,7 @@ The watcher should periodically check `5_done/` and move items older than 4 hour
 - [ ] Existing items in old `5_archived/` are migrated to `6_archived/`
 - [ ] Frontend pipeline display updated if applicable
 - [ ] `.story_kit/README.md` updated to reflect the new pipeline stages
- [ ] `story-kit init` scaffolding creates `5_done/` and `6_archived/` (coordinate with story 116)
+- [ ] `storkit init` scaffolding creates `5_done/` and `6_archived/` (coordinate with story 116)
 ## Out of Scope