Skip the version bump commit if nothing changed, so re-running
script/release for the same version doesn't fail on empty commit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The changelog grep commands return exit code 1 when no commits match,
which set -euo pipefail treats as fatal. Add || true guards so the
script continues to the tag/push/release steps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After the cherry-pick step in run_squash_merge, verify:
1. project_root is on the base branch (not a merge-queue branch)
2. HEAD commit has actual code changes (not an empty/story-only diff)
If either check fails, return success=false so the story stays in merge
stage for retry instead of being phantom-advanced to done.
Also rename move_story_to_archived → move_story_to_done.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Story 424's merge used the wrong variant name HardBlock instead of
RateLimitHardBlock, breaking master compilation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The initial commit added the `throttled` field to `StoryAgent` but missed
several construction sites in lifecycle.rs, test_helpers.rs, and scan.rs.
Also adds the `HardBlock` match arm in the WebSocket event conversion and
minor CSS/import ordering fixes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add HardBlock variant to WatcherEvent (story_id, agent_name, reset_time)
- In pty.rs, distinguish allowed_warning (throttle) from hard blocks;
emit RateLimitWarning for throttles, HardBlock for actual 429s
- Add `throttled: bool` field to StoryAgent / AgentInfo
- Pool spawns a background listener that sets throttled=true on
RateLimitWarning or HardBlock events and fires AgentStateChanged
- Status command shows traffic-light dots: ○ idle, ● running, ◑ throttled, ✗ blocked
- Read blocked flag from story front matter for the ✗ dot
- Notifications: RateLimitWarning silenced (too noisy); HardBlock sends
urgent chat notification with optional reset time
- Tests added for traffic_light_dot, read_story_blocked, status output,
and all notification paths
The new WatcherEvent::RateLimitHardBlock variant added in the feature
commit was not covered in the ws.rs From<WatcherEvent> match, causing
a compile error. Add the missing arm returning None (same as
RateLimitWarning — handled by chat notifications only, not WebSocket).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `unblock` bot command (chat + web UI slash command) that clears the
`blocked` flag and resets `retry_count` to 0 in story front matter
- Works across all pipeline stages (1_backlog through 6_archived)
- Returns confirmation with story name and ID, or clear error if story
is not found or not blocked
- Expose `unblock_story` MCP tool for programmatic use by agents
- Make `chat::commands::unblock` module pub(crate) so story_tools can
call `unblock_by_number`
- Add 8 unit tests covering registration, validation, core logic, and
edge cases (not-found, not-blocked, any stage, story ID in response)
- Update MCP tools list test: 49 → 50 tools
Add three HTTP endpoints for OAuth login without terminal access:
- GET /oauth/authorize — generates PKCE params, redirects to
claude.com/cai/oauth/authorize with code=true and full scopes
- GET /callback — exchanges auth code for tokens via JSON POST to
platform.claude.com/v1/oauth/token, writes ~/.claude/.credentials.json
- GET /oauth/status — returns current credential state as JSON
Uses SHA-256 (sha2 crate) for PKCE code challenge. The authorize URL
targets claude.com/cai/ (not platform.claude.com) which is required
for Max/Pro subscriptions to grant user:inference scope.
Users visit http://localhost:3001/oauth/authorize in their browser
to authenticate. Matrix/WhatsApp can send this link when auth fails.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detect authentication_failed errors from the Claude Code PTY stream
and automatically refresh the OAuth access token using the stored
refresh token in ~/.claude/.credentials.json.
- New module server/src/llm/oauth.rs: reads credentials, calls
platform.claude.com/v1/oauth/token with JSON body, writes back
- PTY provider detects "error":"authentication_failed" via AtomicBool
- chat_stream retries once after successful refresh
- Clear error message if refresh also fails
On success the retry is transparent. On failure the user sees:
"OAuth session expired. Please run claude login to re-authenticate."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Archive completed bug 397
- Create story 399 (CLI port flag)
- Remove @biomejs/cli-darwin-arm64 and @rollup/rollup-darwin-arm64
from package.json (breaks Docker builds on Linux)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New projects now get bot.toml.matrix.example, bot.toml.whatsapp-meta.example,
bot.toml.whatsapp-twilio.example, and bot.toml.slack.example in .storkit/
during scaffolding.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the monolithic bot.toml.example with separate files for each
transport: matrix, whatsapp-meta, whatsapp-twilio, and slack. Add a
chat bot configuration section to the README explaining that only one
transport can be active at a time and how to set up each one.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
work/2_current/, work/3_qa/, work/4_merge/ are not committed to git
(only 1_backlog, 5_done, 6_archived are). New projects were missing
these entries in .storkit/.gitignore.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gVisor is incompatible with OrbStack bind mounts on macOS — writes to
/mnt/mac are blocked by the gVisor filesystem sandbox. Removing
runtime: runsc from docker-compose.yml, the gVisor setup docs from
README.md, and the runsc assertion test from rebuild.rs.
The existing Docker hardening (read-only root, cap_drop ALL,
no-new-privileges, resource limits) remains in place.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bind-mounted node_modules from macOS contains darwin-arm64 native
binaries which don't work in the Linux container. Run npm install on
container startup to get the correct platform binaries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Commit e4227cf (a story creation auto-commit) erroneously deleted 175
files from master's tree, likely due to a race condition between
concurrent git operations. This commit re-adds all files from the
working directory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When git worktree remove fails with "not a working tree", fall back to
removing the directory directly and run git worktree prune to clean
stale metadata. Fixes bug 364.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.story_kit/ and .story_kit_port were stale references from before the
rename to storkit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When unset, Claude Code falls back to OAuth credentials from
`claude login`, allowing agents to run on a Max subscription
instead of prepaid API credits.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Docker named volumes inherit directory ownership when first created.
By creating /workspace/target and /app/target as storkit-owned before
the USER directive, the volumes will be writable by the storkit user.
Without this, cargo build/test/clippy all fail with Permission Denied.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverts workarounds added by the 361 agent when the hardened Docker
container broke the test suite:
- gates.rs: restore tempfile::tempdir() (was changed to tempdir_in
CARGO_MANIFEST_DIR to avoid noexec /tmp; noexec is now removed)
- pool/mod.rs: restore ps -p <pid> check in process_is_running (was
changed to /proc/<pid> existence check; procps is now installed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add procps to runtime stage so `ps` is available for process management
- Remove noexec from /tmp and /home/storkit tmpfs mounts so test scripts
can be executed from tempdir
- Update coder agent system_prompt to run clippy --all-targets --all-features
matching what the server acceptance gate actually runs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Acceptance gates run cargo clippy but the component wasn't installed
in the build stage. Agents were doing real work then failing every
gate check because clippy wasn't available.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The tmpfs at /home/storkit defaulted to root ownership (mode=755),
so Claude Code couldn't write ~/.claude.json or ~/.cache/. Set
uid=999,gid=999 to match the storkit user.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Claude Code writes to ~/.claude.json, ~/.cache/, and ~/.npm/ which
failed silently on the read-only root filesystem. Add tmpfs at
/home/storkit so the home dir is writable (the claude-state volume
overlays on top for persistent .claude/ data).
Also fix .dockerignore: use **/target/ to match nested target dirs,
add .storkit/logs/ and **/node_modules/ to prevent multi-GB build
context transfers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use the known cargo build output path instead of current_exe() when
re-execing after a rebuild. In Docker, the running binary lives at
/usr/local/bin/storkit (read-only) while cargo writes the new binary
to /app/target/release/storkit (a writable volume), so current_exe()
would just restart the old binary.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run as non-root user (fixes Claude Code refusing bypassPermissions as
root, which caused all agent spawns to exit instantly with no session).
Add read-only root filesystem, drop all capabilities, set
no-new-privileges, bind port to localhost only, and require
GIT_USER_NAME/GIT_USER_EMAIL env vars at startup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Bumps server/Cargo.toml and frontend/package.json to 0.4.1
- Release script now auto-bumps both version files when run
- Changelog generation matches both "storkit:" and "story-kit:" prefixes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The .story_kit → .storkit rename updated the grep pattern but all historical
merge commits still use the old "story-kit:" prefix, so overview could not
find any stories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Renames the config directory and updates 514 references across 42 Rust
source files, plus CLAUDE.md, .gitignore, Makefile, script/release,
and .mcp.json files. All 1205 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates -p flag in rebuild_and_restart, MCP server name, enabledMcpjsonServers,
and test values to match the new binary/crate name.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds htop bot command with live-updating Matrix message showing system
load and per-agent CPU/memory usage. Supports timeout override and
htop stop. Resolved conflict with git command in commands.rs registry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stories got stuck in QA/merge when agents were busy at assignment time.
Consolidates auto_assign into a single unconditional call at the end of
run_pipeline_advance, so whenever any agent completes, the system
immediately scans for pending work and assigns free agents.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moves status, ambient, and help commands into a unified command registry
in commands.rs. Help output now automatically lists all registered
commands. Resolved merge conflict with 1_backlog rename.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug 283 was implemented with manual_qa defaulting to true, causing all
stories to hold in QA for human review. Changed to default false as
originally specified — stories advance automatically unless explicitly
opted in with manual_qa: true.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add permission rules to .claude/settings.json
- Document empty merge and direct-to-master problems in problems.md
- Fix agent stream URL to use vite proxy instead of hardcoded host
- Add /agents proxy config to vite.config.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change pkill pattern to 'target.*story-kit' to only match the Rust
binary, not any process with story-kit in its working directory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed the --system argument from the PTY runner — Claude Code CLI
doesn't support it. Bot name instruction is now prepended to the user
prompt instead of passed as a system prompt argument.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This returns the full tool catalog (create stories, spawn agents, record tests, manage worktrees, etc.). Familiarize yourself with the available tools before proceeding. These tools allow you to directly manipulate the workflow and spawn subsidiary agents without manual file manipulation.
2. **Read Context:** Check `.story_kit/specs/00_CONTEXT.md` for high-level project goals.
3. **Read Stack:** Check `.story_kit/specs/tech/STACK.md` for technical constraints and patterns.
4. **Check Work Items:** Look at `.story_kit/work/1_upcoming/` and `.story_kit/work/2_current/` to see what work is pending.
4. **Check Work Items:** Look at `.story_kit/work/1_backlog/` and `.story_kit/work/2_current/` to see what work is pending.
Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
@@ -87,7 +87,7 @@ Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
The server watches `.story_kit/work/` for changes. When a file is created, moved, or modified, the watcher auto-commits with a deterministic message and broadcasts a WebSocket notification to the frontend. This means:
* MCP tools only need to write/move files — the watcher handles git commits
* IDE drag-and-drop works (drag a story from `1_upcoming/` to `2_current/`)
* IDE drag-and-drop works (drag a story from `1_backlog/` to `2_current/`)
* The frontend updates automatically without manual refresh
---
@@ -156,7 +156,7 @@ Not everything needs to be a full story. Simple bugs can skip the story process:
* Performance issues with known fixes
### Bug Process
1. **Document Bug:** Create a bug file in `work/1_upcoming/` named `{id}_bug_{slug}.md` with:
1. **Document Bug:** Create a bug file in `work/1_backlog/` named `{id}_bug_{slug}.md` with:
***Symptom:** What the user observes
***Root Cause:** Technical explanation (if known)
***Reproduction Steps:** How to trigger the bug
@@ -186,7 +186,7 @@ Not everything needs a story or bug fix. Spikes are time-boxed investigations to
* Need to validate performance constraints
### Spike Process
1. **Document Spike:** Create a spike file in `work/1_upcoming/` named `{id}_spike_{slug}.md` with:
1. **Document Spike:** Create a spike file in `work/1_backlog/` named `{id}_spike_{slug}.md` with:
***Question:** What you need to answer
***Hypothesis:** What you expect to be true
***Timebox:** Strict limit for the research
@@ -209,7 +209,7 @@ When the LLM context window fills up (or the chat gets slow/confused):
1. **Stop Coding.**
2. **Instruction:** Tell the user to open a new chat.
3. **Handoff:** The only context the new LLM needs is in the `specs/` folder and `.mcp.json`.
* *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_upcoming/` and `work/2_current/` to see what is pending."
* *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_backlog/` and `work/2_current/` to see what is pending."
---
@@ -221,14 +221,36 @@ If a user hands you this document and says "Apply this process to my project":
1. **Check for MCP Tools:** Look for `.mcp.json` in the project root. If it exists, you have programmatic access to workflow tools and agent spawning capabilities.
2. **Analyze the Request:** Ask for the high-level goal ("What are we building?") and the tech preferences ("Rust or Python?").
3. **Git Check:** Check if the directory is a git repository (`git status`). If not, run `git init`.
4. **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_upcoming/` through `work/6_archived/`).
4. **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_backlog/` through `work/6_archived/`).
5. **Draft Context:** Write `specs/00_CONTEXT.md` based on the user's answer.
6. **Draft Stack:** Write `specs/tech/STACK.md` based on best practices for that language.
7. **Wait:** Ask the user for "Story #1".
---
## 6. Code Quality
## 6. Chat Bot Configuration
Story Kit includes a chat bot that can be connected to one messaging platform at a time. The bot handles commands, LLM conversations, and pipeline notifications.
**Only one transport can be active at a time.** To configure the bot, copy the appropriate example file to `.storkit/bot.toml`:
The `bot.toml` file is gitignored (it contains secrets). The example files are checked in for reference.
---
## 7. Code Quality
**MANDATORY:** Before completing Step 3 (Verification) of any story, you MUST run all applicable linters, formatters, and test suites and fix ALL errors and warnings. Zero tolerance for warnings or errors.
Recurring issues observed during pipeline operation. Review periodically and create stories for systemic problems.
## 2026-03-18: Stories graduating to "done" with empty merges (7 of 10)
Pipeline allows stories to move through coding → QA → merge → done without any actual code changes landing on master. The squash-merge produces an empty diff but the pipeline still marks the story as done. Affected stories: 247, 273, 274, 278, 279, 280, 92. Only 266, 271, 277, and 281 actually shipped code. Root cause: no check that the merge commit contains a non-empty diff. Filed bug 283 for the manual_qa gate issue specifically, but the empty-merge-to-done problem is broader and needs its own fix.
## 2026-03-18: Agent committed directly to master instead of worktree
Multiple agents have committed directly to master instead of their worktree/feature branch:
- Commit `5f4591f` ("fix: update should_commit_stage test to match 5_done") — likely mergemaster
- Commit `a32cfbd` ("Add bot-level command registry with help command") — story 285 coder committed code + Cargo.lock directly to master
Agents should only commit to their feature branch or merge-queue branch, never to master directly. Suspect agents are running `git commit` in the project root instead of the worktree directory. This can also revert uncommitted fixes on master (e.g. project.toml pkill fix was overwritten). Frequency: at least 2 confirmed cases. This is a recurring and serious problem — needs a guard in the server or agent prompts.
## 2026-03-19: Auto-assign re-assigns mergemaster to failed merge stories in a loop
After bug 295 fix (`auto_assign_available_work` after every pipeline advance), mergemaster gets re-assigned to stories that already have a merge failure flag. Story 310 had an empty diff merge failure — mergemaster correctly reported the failure, but auto-assign immediately re-assigned mergemaster to the same story, creating an infinite retry loop. The auto-assign logic needs to check for the `merge_failure` front matter flag before re-assigning agents to stories in `4_merge/`.
## 2026-03-19: Coder produces no code (complete ghost — story 310)
Story 310 (Bot delete command) went through the full pipeline — coder session ran, passed QA/gates, moved to merge — but the coder produced zero code. No commits on the feature branch, no commits on master. The entire agent session was a no-op. This is different from the "committed to master instead of worktree" problem — in this case, the coder simply did nothing. Need to investigate the coder logs to understand what happened. The empty-diff merge check would catch this at merge time, but ideally the server should detect "coder finished with no commits on feature branch" at the gate-check stage and fail early.
## 2026-03-19: Auto-assign assigns mergemaster to coding-stage stories
Auto-assign picked mergemaster for story 310 which was in `2_current/`. Mergemaster should only work on stories in `4_merge/`. The `auto_assign_available_work` function doesn't enforce that the agent's configured stage matches the pipeline stage of the story it's being assigned to. Story 279 (auto-assign respects agent stage from front matter) was supposed to fix this, but the check may only apply to front-matter preferences, not the fallback assignment path.
system_prompt="You are a supervisor agent. Read CLAUDE.md and .story_kit/README.md first to understand the project dev process. Use MCP tools to coordinate sub-agents. Never implement code directly - always delegate to coder agents and monitor their progress. Use wait_for_agent to block until the coder finishes — the server automatically runs acceptance gates when the agent process exits. Never accept stories or merge to master - get all gates green and report to the human."
[[agent]]
name="coder-1"
stage="coder"
@@ -57,7 +38,7 @@ model = "sonnet"
max_turns=50
max_budget_usd=5.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
name="coder-2"
@@ -67,7 +48,17 @@ model = "sonnet"
max_turns=50
max_budget_usd=5.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
name="coder-3"
stage="coder"
role="Full-stack engineer. Implements features across all components."
model="sonnet"
max_turns=50
max_budget_usd=5.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
name="qa-2"
@@ -102,7 +93,7 @@ Read CLAUDE.md first, then .story_kit/README.md to understand the dev process.
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a senior full-stack engineer working autonomously in a git worktree. You handle complex tasks requiring deep architectural understanding. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
system_prompt="You are a senior full-stack engineer working autonomously in a git worktree. You handle complex tasks requiring deep architectural understanding. Follow the Story-Driven Test Workflow strictly. Run cargo clippy --all-targets --all-features and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
name="qa"
@@ -179,7 +170,7 @@ Read CLAUDE.md first, then .story_kit/README.md to understand the dev process.
@@ -118,8 +118,8 @@ To support both Remote and Local models, the system implements a `ModelProvider`
Multiple instances can run simultaneously in different worktrees. To avoid port conflicts:
- **Backend:** Set `STORYKIT_PORT` to a unique port (default is 3001). Example: `STORYKIT_PORT=3002 cargo run`
- **Frontend:** Run `npm run dev` from `frontend/`. It auto-selects the next unused port. It reads `STORYKIT_PORT` to know which backend to talk to, so export it before running: `export STORYKIT_PORT=3002 && cd frontend && npm run dev`
- **Backend:** Set `STORKIT_PORT` to a unique port (default is 3001). Example: `STORKIT_PORT=3002 cargo run`
- **Frontend:** Run `npm run dev` from `frontend/`. It auto-selects the next unused port. It reads `STORKIT_PORT` to know which backend to talk to, so export it before running: `export STORKIT_PORT=3002 && cd frontend && npm run dev`
When running in a worktree, use a port that won't conflict with the main instance (3001). Ports 3002+ are good choices.
# Story 388: WhatsApp webhook HMAC signature verification
## User Story
As a bot operator, I want incoming WhatsApp webhook requests to be cryptographically verified, so that forged requests from unauthorized sources are rejected.
## Acceptance Criteria
- [ ] Meta webhooks: validate X-Hub-Signature-256 HMAC-SHA256 header using the app secret before processing
- [ ] Twilio webhooks: validate request signature using the auth token before processing
- [ ] Requests with missing or invalid signatures are rejected with 403 Forbidden
- [ ] Verification is fail-closed: if signature checking is configured, unsigned requests are rejected
- [ ] Existing bot.toml config is extended with any needed secrets (e.g. Meta app_secret for HMAC verification)
- [ ] MUST use audited crypto crates (hmac, sha2, sha1, base64) — no hand-rolled cryptographic primitives
name: "Fly.io Machines API integration for multi-tenant storkit SaaS"
---
# Spike 408: Fly.io Machines API integration for multi-tenant storkit SaaS
## Question
Can we build a working Rust integration that creates and manages per-tenant Fly.io Machines, attaches volumes, injects Claude credentials, and proxies JWT-authenticated HTTP/WebSocket traffic to the right machine?
## Hypothesis
A thin Rust service using `reqwest` for the Machines API and `axum` for the reverse proxy is sufficient. No heavyweight orchestration framework needed.
## Prerequisites
- Fly.io account with API token (set `FLY_API_TOKEN` env var)
- Spike 407 findings reviewed
## Timebox
4 hours
## Investigation Plan
- [ ] Create a minimal Rust crate in `spikes/fly_machines/` — do not touch production code
- [ ] Implement machine lifecycle: create, start, stop, destroy via Fly Machines REST API using `reqwest`
- [ ] Test attaching a persistent volume to a machine and verify it persists across stop/start
- [ ] Test secret injection — pass a dummy `credentials.json` as a Fly secret and verify it's readable inside the machine
- [ ] Sketch the auth proxy: JWT validation → machine lookup → reverse proxy to machine's private IP; verify WebSocket proxying works
- [ ] Measure actual cold start time for a minimal storkit container image
- [ ] Document any API quirks, rate limits, or sharp edges discovered during testing
name: "Multi-account OAuth token rotation on rate limit"
---
# Story 411: Multi-account OAuth token rotation on rate limit
## User Story
As a storkit user with multiple Claude Max subscriptions, I want the system to automatically rotate to a different account when one gets rate limited, so that agents and chat don't stall out waiting for limits to reset.
## Acceptance Criteria
- [ ] OAuth login flow stores credentials per-account (keyed by email), not overwriting previous accounts
- [ ] GET /oauth/status returns all stored accounts and their status (active, rate-limited, expired)
- [ ] When the active account hits a rate limit, storkit automatically swaps to the next available account's refresh token, refreshes, and retries
- [ ] The bot sends a notification in Matrix/WhatsApp when it swaps accounts
- [ ] If all accounts are rate limited, the bot surfaces a clear message with the time until the earliest reset
- [ ] A new /oauth/authorize login adds to the account pool rather than replacing the current credentials
name: "Recheck bot command to re-run gates without restarting agent"
---
# Story 412: Recheck bot command to re-run gates without restarting agent
## User Story
As a user, I want to send `recheck <number>` to the bot so that it re-runs acceptance gates on an existing worktree without spawning a new agent, so I can unblock stories that failed due to environment issues without wasting agent turns.
## Acceptance Criteria
- [ ] recheck command is registered in chat/commands/mod.rs and appears in help output
- [ ]`recheck <number>` runs run_acceptance_gates on the story's existing worktree
- [ ] If gates pass, the story advances through the pipeline (same as if a coder completed successfully)
- [ ] If gates fail, the error output is returned to the user (not silently retried)
- [ ] If no worktree exists for the story, returns a clear error
- [ ] Does not spawn a new agent or increment retry_count
- [ ] Works from all transports (Matrix, WhatsApp, Slack)
name: "Interactive project setup wizard for new storkit projects"
agent: coder-opus
---
# Story 429: Interactive project setup wizard for new storkit projects
## User Story
As a developer adopting storkit on an existing project, I want a guided setup process that scaffolds the .storkit directory and has an agent generate project-specific configuration files, so that I can get up and running without manually writing specs and scripts.
## Acceptance Criteria
- [ ] storkit init scaffolds .storkit/ directory structure, project.toml, and .mcp.json without clobbering any existing files (especially CLAUDE.md)
- [ ] Setup wizard tracks progress through ordered steps, resumable if interrupted
# Refactor 418: Split pool/auto_assign.rs into submodules
## Current State
- TBD
## Desired State
Refactor the monolithic server/src/agents/pool/auto_assign.rs (1813 lines) into focused submodules.
## Acceptance Criteria
- [ ] auto_assign.rs contains auto_assign_available_work and its unit tests
- [ ] reconcile.rs contains reconcile_on_startup and its unit tests
- [ ] watchdog.rs contains run_watchdog_once, spawn_watchdog, check_orphaned_agents and their unit tests
- [ ] scan.rs contains scan_stage_items, is_story_assigned_for_stage, count_active_agents_for_stage, find_free_agent_for_stage, is_agent_free and their unit tests
- [ ] story_checks.rs contains read_story_front_matter_agent, has_review_hold, is_story_blocked, has_merge_failure and their unit tests
- [ ] mod.rs wires the submodules and re-exports all public items
- [ ] Unit tests live in their respective module files
- [ ] No public API changes — all existing imports continue to work
name: "Matrix bot crashes on transient network error instead of retrying"
---
# Bug 419: Matrix bot crashes on transient network error instead of retrying
## Description
The Matrix bot treats a transient sync error as fatal and stops entirely. A single failed HTTP request to the homeserver kills the bot, requiring a full server rebuild to recover.
## How to Reproduce
1. Run storkit with Matrix bot enabled\n2. Homeserver becomes temporarily unreachable (network blip, DNS hiccup, server restart)\n3. Bot hits sync error and crashes
## Actual Result
Bot logs "Fatal error: Matrix sync error: error sending request for url (...)" and stops responding. No retry, no recovery.
## Expected Result
Bot logs a warning, backs off with exponential delay, and retries the sync. Only crash on unrecoverable errors (invalid credentials, banned, etc).
## Acceptance Criteria
- [ ] Transient network errors (connection refused, timeout, DNS failure) trigger a retry with exponential backoff
- [ ] Bot logs a warning on each failed retry attempt
- [ ] Bot resumes normal operation once the homeserver is reachable again
- [ ] Unrecoverable errors (401, 403) still cause a clean shutdown with a clear error message
- [ ] Bot sends a notification after recovering from a network outage
name: "loc for a specified file — bot command and web UI slash command"
---
# Story 420: loc for a specified file — bot command and web UI slash command
## User Story
As a developer, I want to send `loc <filepath>` to the bot or use it as a slash command in the web UI to see the line count for a specific file, so I can quickly check how large a file is without leaving my workflow.
## Acceptance Criteria
- [ ] loc <filepath> returns the line count for the specified file
- [ ] Relative paths are resolved against the project root
- [ ] If the file does not exist, returns a clear error
- [ ] Works from all transports (Matrix, WhatsApp, Slack)
- [ ] Works as a slash command in the web UI
- [ ] loc with no argument retains existing behavior (top files by line count)
- [ ] Exposed as an MCP tool so agents can query file line counts programmatically
# Story 421: Timer command for deferred agent start
## User Story
As a ..., I want ..., so that ...
## Acceptance Criteria
- [ ] Bot command `timer <story_id> <HH:MM>` schedules a one-shot deferred start for the given story at the next occurrence of that time (server-local timezone)
- [ ] Bot command `timer list` shows all pending timers with story ID and scheduled time
- [ ] Bot command `timer cancel <story_id>` removes the pending timer for that story
- [ ] Timers are persisted to .storkit/timers.json so they survive server restarts
- [ ] A 30s tick loop (tokio task, same pattern as watchdog) checks for due timers and calls start_agent when triggered
- [ ] When a timer fires, the story must already be in current — timer does not move stories between stages
- [ ] Fired timers are removed after execution (one-shot, not recurring)
- [ ] Multiple timers for the same time are supported and respect agent slot contention via auto-assign
- [ ] Pipeline status messages show a coloured dot next to each work item: green for running normally, yellow for throttled, red for hard blocked, white/grey for idle/no agent
- [ ] Hard block events (429 / rate_limit_exceeded) still send an individual chat notification with a red icon, including the reset time
- [ ] Throttle and block state tracked per-agent so the status dot updates in real time
- [ ] Server-side logging of throttle warnings is preserved for debugging
- [ ] Traffic light dots in status report should be small/compact, not large emoji
name: "Chat notification when a story blocks with reason"
---
# Story 425: Chat notification when a story blocks with reason
## User Story
As a project owner monitoring agent progress via chat, I want to receive a notification when a story gets blocked, including the reason, so that I can decide whether to unblock it or investigate the failure.
## Acceptance Criteria
- [ ] When a story transitions to blocked state, send a chat notification to all configured transports
- [ ] Notification includes the story ID, story name, and the reason for blocking (e.g. gate failure output, max retries exceeded, empty diff)
- [ ] Notification uses a red or warning icon to distinguish from normal status messages
- [ ] Works across Matrix, WhatsApp, and Slack transports
name: "Mergemaster pipeline marks story done without verifying code landed on master"
retry_count: 1
---
# Bug 426: Mergemaster pipeline marks story done without verifying code landed on master
## Description
The mergemaster pipeline can mark a story as done even when the feature code never makes it to master. The cherry-pick step in merge.rs may fail or be skipped, but the pipeline still advances the story to done via the filesystem watcher. There is no post-merge verification that the code actually exists on master before marking done.
## How to Reproduce
Observed on stories 422 and 403. For 422: mergemaster created merge-queue branch, resolved 2 conflicts in chat/commands/mod.rs and http/mcp/mod.rs, passed quality gates, created merge-queue commit cb2ef6b (4 files, 333 insertions including unblock.rs). But the done commit on master (05db012) only moves the story file — zero code changes. There is no 'storkit: merge 422' commit on master at all. The feature branch (db3157f) still has the code but it was never cherry-picked onto master.
## Manual Merge Notes
When manually cherry-picking 422 onto master, two conflicts arose:
1. `server/src/chat/commands/mod.rs` — both 421 (timer) and 422 (unblock) added entries to the same BotCommand registry. Resolution: keep both.
2. `server/src/http/mcp/mod.rs` — 420 (loc_file) and 422 (unblock) both bumped the tool count assertion from 49→50. Resolution: keep loc_file assertion, bump count to 51.
Additionally, the cherry-pick could not proceed at all because master was on the `merge-queue/424` branch with 3 unresolved files (notifications.rs, ws.rs, watcher.rs). A concurrent in-progress merge left the working tree dirty, which likely caused the original cherry-pick to fail silently. This suggests a race condition: the filesystem watcher commits (story file moves) can leave master in a state where the cherry-pick step in merge.rs fails.
## Full Audit of Done Stories (2026-03-28)
Audited all 9 stories in `5_done/` to check whether their code actually landed on master:
| **424 — Rate limit traffic light** | **None** | **NO — moved back to backlog for redo** |
| 425 — Chat notification on story block | `98b5475` (5 files, +184/-15) | YES |
| **427 — Text normalization for line breaks** | **None** | **NO — phantom done, code never landed** |
**4 out of 10 stories (422, 423, 424, 427) had broken merges.** 422 and 423 were fixed via manual cherry-pick. 424 was moved back to backlog for a fresh run. 427 also hit the same bug — marked done without code on master.
## Actual Result
Story moved to done with no code on master. The merge-queue commit exists on a detached branch but was never applied to master. No merge commit appears in git log on master.
## Expected Result
Pipeline should verify that the cherry-pick produced a merge commit on master before advancing to done. If cherry-pick fails or is missing, the story should remain in merge stage with a merge_failure flag.
## Suggested Fix
The code path is: `merge.rs::run_squash_merge` → `pipeline/merge.rs::start_merge_agent_work` → `lifecycle.rs::move_story_to_archived`.
`run_squash_merge` (merge.rs:354) cherry-picks the merge-queue commit onto `project_root` and checks `cp.status.success()`. If it returns `success: true`, `start_merge_agent_work` (pipeline/merge.rs:106) immediately calls `move_story_to_archived`, which moves the story file to `5_done/`. The watcher then commits "storkit: done".
The gap: between the cherry-pick returning success and the story moving to done, nobody verifies the cherry-pick actually produced a code commit on master. Possible failure modes:
1. `project_root` is not on master (e.g. checked out to a merge-queue branch from a concurrent merge)
2. Cherry-pick exits 0 but produces an empty commit (no code diff)
3. Cherry-pick succeeds on the wrong branch
**Fix:** After the cherry-pick in `run_squash_merge` succeeds (line 384), before returning `success: true`:
1. Verify `project_root` is on master: `git rev-parse --abbrev-ref HEAD` must equal the base branch
2. Verify the HEAD commit on master contains the expected merge message (e.g. matches `storkit: merge <story_id>`) or has a non-empty diff
3. If either check fails, abort the cherry-pick and return `success: false`
This keeps the fix entirely within `run_squash_merge` — no changes needed to the pipeline advance or lifecycle code.
## Acceptance Criteria
- [ ] Pipeline must not move a story to done unless a merge commit containing the feature code exists on master
- [ ] If cherry-pick fails or produces no code diff on master, the merge must be reported as failed
- [ ] Add a post-merge verification step that checks git log on master for the expected merge commit before advancing to done
- [ ] When verification fails, emit a merge_failure and leave the story in the merge stage for retry
name: "Server-side text normalization for chat message line breaks"
---
# Story 427: Server-side text normalization for chat message line breaks
## User Story
As a user reading bot messages in Matrix, I want single newlines between sentences to render correctly, so that messages don't show up with words joined together like "sentence one.Sentence two".
## Acceptance Criteria
- [ ] Add a text normalization step before markdown-to-HTML conversion in the Matrix transport that converts single newlines between non-empty prose lines into double newlines
- [ ] Preserve intentional single-newline formatting in bullet lists, headings, table rows, and code fences
- [ ] Apply the same normalization in WhatsApp and Slack transports
- [ ] Unit tests covering prose paragraphs, bullet lists, code blocks, and mixed content
@@ -10,7 +10,7 @@ The `prompt_permission` MCP tool returns plain text ("Permission granted for '..
## How to Reproduce
1. Start the story-kit server and open the web UI
1. Start the storkit server and open the web UI
2. Chat with the claude-code-pty model
3. Ask it to do something that requires a tool NOT in `.claude/settings.json` allow list (e.g. `wc -l /etc/hosts`, or WebFetch to a non-allowed domain)
@@ -6,7 +6,7 @@ name: "Retry limit for mergemaster and pipeline restarts"
## User Story
As a developer using story-kit, I want pipeline auto-restarts to have a configurable retry limit so that failing agents don't loop infinitely consuming CPU and API credits.
As a developer using storkit, I want pipeline auto-restarts to have a configurable retry limit so that failing agents don't loop infinitely consuming CPU and API credits.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.