- Bumps server/Cargo.toml and frontend/package.json to 0.4.1
- Release script now auto-bumps both version files when run
- Changelog generation matches both "storkit:" and "story-kit:" prefixes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The .story_kit → .storkit rename updated the grep pattern but all historical
merge commits still use the old "story-kit:" prefix, so overview could not
find any stories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Renames the config directory and updates 514 references across 42 Rust
source files, plus CLAUDE.md, .gitignore, Makefile, script/release,
and .mcp.json files. All 1205 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates -p flag in rebuild_and_restart, MCP server name, enabledMcpjsonServers,
and test values to match the new binary/crate name.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds htop bot command with live-updating Matrix message showing system
load and per-agent CPU/memory usage. Supports timeout override and
htop stop. Resolved conflict with git command in commands.rs registry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stories got stuck in QA/merge when agents were busy at assignment time.
Consolidates auto_assign into a single unconditional call at the end of
run_pipeline_advance, so whenever any agent completes, the system
immediately scans for pending work and assigns free agents.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moves status, ambient, and help commands into a unified command registry
in commands.rs. Help output now automatically lists all registered
commands. Resolved merge conflict with 1_backlog rename.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug 283 was implemented with manual_qa defaulting to true, causing all
stories to hold in QA for human review. Changed to default false as
originally specified — stories advance automatically unless explicitly
opted in with manual_qa: true.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add permission rules to .claude/settings.json
- Document empty merge and direct-to-master problems in problems.md
- Fix agent stream URL to use vite proxy instead of hardcoded host
- Add /agents proxy config to vite.config.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change pkill pattern to 'target.*story-kit' to only match the Rust
binary, not any process with story-kit in its working directory.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed the --system argument from the PTY runner — Claude Code CLI
doesn't support it. Bot name instruction is now prepended to the user
prompt instead of passed as a system prompt argument.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cast lastSendChatArgs through unknown to satisfy TypeScript's
strict type narrowing on the nullable union type.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds system_prompt parameter to chat_stream so the Matrix bot
passes "Your name is {name}" to Claude Code. Reads display_name
from bot.toml config. Resolved conflicts by integrating bot_name
into master's permission-handling code structure.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The history trimming logic cleared session_id = None whenever entries
exceeded history_size. Since the history was always at capacity, every
new message wiped the session_id immediately after storing it. This
meant --resume was never passed to Claude Code, making every turn a
fresh conversation.
Fix: preserve session_id on trim. Claude Code's --resume loads the
full conversation from its own session transcript on disk, so trimming
our local tracking entries doesn't invalidate the session.
Also adds debug logging for session_id capture/storage (temporary).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test was asserting 5_done should NOT trigger commits, but commit 74dc42c
added 5_done to COMMIT_WORTHY_STAGES. Updated test to match.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Upgrade mergemaster prompt to resolve complex conflicts itself
instead of just reporting failure. Includes instructions to check
git history and story files for context before resolving.
- Add proxy error handler to vite config to prevent crashes on
backend ECONNREFUSED.
- Fix bug 279: auto-assign now checks that preferred agent's stage
matches the pipeline stage. Coders won't be assigned to QA/merge.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds assigned agent display to the expanded work item detail panel.
Resolved conflicts by keeping master versions of bot.rs (permission
handling), ChatInput.tsx, and fs.rs. Removed duplicate list_project_files
endpoint and tests from io.rs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mergemaster's auto-resolver inserted a duplicate chat_stream call
inside the tokio::select! permission loop, producing mismatched braces.
The 266 merge didn't contribute useful code changes to bot.rs — the
session_id handling was already present.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ignore 2_current/, 3_qa/, and 4_merge/ in git since spike 92 stopped
committing intermediate pipeline moves. Keep 5_done/ tracked so
accepted work is in git history before sweep to archived.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Filter flush_pending() to only git-commit for terminal stages
(1_upcoming and 6_archived) while still broadcasting WatcherEvents
for all stages so the frontend stays in sync.
Reduces pipeline commits from 5+ to 2 per story run. No system
dependencies on intermediate commits were found.
Preserves merge_failure front matter cleanup from master.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
script/test unconditionally cd'd into frontend/ and ran npm test, which
failed in sparse-checkout worktrees that only contain the server/ subtree.
Guard the frontend step with a directory existence check so acceptance
gates pass for worktrees that have no frontend checkout.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Vite only needs to watch frontend/ sources. Ignore git objects,
Rust source, Cargo files, node_modules, and vendor directories
that change in bulk during squash merges.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The server writes .mcp.json with its current port on startup and
during worktree creation. Tracking it in git causes QA gate failures
when the worktree copy diverges from master (e.g. different port).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Generate structured changelogs from completed stories instead of raw
commit messages. Group by features, bug fixes, and refactors. Filter
out story-kit automation commits.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
TypeScript control flow analysis can't track reassignment inside vi.mock
callbacks, causing lastSendChatArgs to narrow to never. Use non-null
assertions after the explicit toBeNull() guard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The homeserver requires User-Interactive Authentication before accepting
cross-signing keys. Pass the bot's password so bootstrap succeeds.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_command_with_timeout piped stdout/stderr but only read them after
the child exited. When test output exceeded the 64KB OS pipe buffer,
the child blocked on write() while the parent blocked on waitpid() —
a permanent deadlock that caused every merge pipeline to hang.
Drain both pipes in background threads so the buffers never fill.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The merge pipeline (squash merge + quality gates) takes well over 60
seconds. Claude Code's MCP HTTP transport times out at 60s, causing
"completed with no output" — the mergemaster retries fruitlessly.
merge_agent_work now starts the pipeline as a background task and
returns immediately. A new get_merge_status tool lets the mergemaster
poll until the job reaches a terminal state. Also adds a double-start
guard so concurrent calls for the same story are rejected.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test commands in run_project_tests now use wait-timeout to enforce a
600-second ceiling, preventing hung processes (e.g. Playwright with no
server) from blocking the merge pipeline indefinitely. Also disables
e2e tests in script/test until the merge workspace can run them safely.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an optional `agent:` field to story file front matter so that a
specific agent can be requested for a story. The auto-assign loop now:
1. Reads the front-matter `agent` field for each story before picking
a free agent.
2. If a preferred agent is named, uses it when free; skips the story
(without falling back) when that agent is busy.
3. Falls back to the existing `find_free_agent_for_stage` behaviour
when no preference is specified.
Ported from feature branch that predated the agents.rs module refactoring.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split the monolithic agents.rs into 6 focused modules:
- mod.rs: shared types (AgentEvent, AgentStatus, etc.) and re-exports
- pool.rs: AgentPool struct, all methods, and helper free functions
- pty.rs: PTY streaming (run_agent_pty_blocking, emit_event)
- lifecycle.rs: story movement functions (move_story_to_qa, etc.)
- gates.rs: acceptance gates (clippy, tests, coverage)
- merge.rs: squash-merge, conflict resolution, quality gates
All 121 original tests are preserved and distributed across modules.
Also adds clear_front_matter_field to story_metadata.rs to strip
stale merge_failure from front matter when stories move to done.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add "Bug Workflow: Root Cause First" guidance to all coder agent prompts
and system prompts. Adds a test ensuring all coder-stage agents include
root cause, git bisect/log, and anti-workaround instructions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Commit untracked work pipeline files (stories, bugs in various stages)
and package-lock.json that were present on the filesystem but not yet
tracked by git.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Correct lastScrollTopRef assignment in scrollToBottom to read back the
browser-capped scrollTop value instead of using scrollHeight, which was
causing handleScroll to incorrectly detect upward scrolling and disable
auto-scroll.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the require_verified_devices config toggle. The bot now always requires
encrypted rooms and cross-signing-verified devices before executing any command.
Messages from unencrypted rooms or unverified devices are rejected.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement /btw side question slash command — lets users ask quick
questions from conversation context without disrupting the main chat.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This returns the full tool catalog (create stories, spawn agents, record tests, manage worktrees, etc.). Familiarize yourself with the available tools before proceeding. These tools allow you to directly manipulate the workflow and spawn subsidiary agents without manual file manipulation.
This returns the full tool catalog (create stories, spawn agents, record tests, manage worktrees, etc.). Familiarize yourself with the available tools before proceeding. These tools allow you to directly manipulate the workflow and spawn subsidiary agents without manual file manipulation.
2. **Read Context:** Check `.story_kit/specs/00_CONTEXT.md` for high-level project goals.
2. **Read Context:** Check `.story_kit/specs/00_CONTEXT.md` for high-level project goals.
3. **Read Stack:** Check `.story_kit/specs/tech/STACK.md` for technical constraints and patterns.
3. **Read Stack:** Check `.story_kit/specs/tech/STACK.md` for technical constraints and patterns.
4. **Check Work Items:** Look at `.story_kit/work/1_upcoming/` and `.story_kit/work/2_current/` to see what work is pending.
4. **Check Work Items:** Look at `.story_kit/work/1_backlog/` and `.story_kit/work/2_current/` to see what work is pending.
Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
@@ -87,7 +87,7 @@ Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
The server watches `.story_kit/work/` for changes. When a file is created, moved, or modified, the watcher auto-commits with a deterministic message and broadcasts a WebSocket notification to the frontend. This means:
The server watches `.story_kit/work/` for changes. When a file is created, moved, or modified, the watcher auto-commits with a deterministic message and broadcasts a WebSocket notification to the frontend. This means:
* MCP tools only need to write/move files — the watcher handles git commits
* MCP tools only need to write/move files — the watcher handles git commits
* IDE drag-and-drop works (drag a story from `1_upcoming/` to `2_current/`)
* IDE drag-and-drop works (drag a story from `1_backlog/` to `2_current/`)
* The frontend updates automatically without manual refresh
* The frontend updates automatically without manual refresh
---
---
@@ -156,7 +156,7 @@ Not everything needs to be a full story. Simple bugs can skip the story process:
* Performance issues with known fixes
* Performance issues with known fixes
### Bug Process
### Bug Process
1. **Document Bug:** Create a bug file in `work/1_upcoming/` named `{id}_bug_{slug}.md` with:
1. **Document Bug:** Create a bug file in `work/1_backlog/` named `{id}_bug_{slug}.md` with:
***Symptom:** What the user observes
***Symptom:** What the user observes
***Root Cause:** Technical explanation (if known)
***Root Cause:** Technical explanation (if known)
***Reproduction Steps:** How to trigger the bug
***Reproduction Steps:** How to trigger the bug
@@ -186,7 +186,7 @@ Not everything needs a story or bug fix. Spikes are time-boxed investigations to
* Need to validate performance constraints
* Need to validate performance constraints
### Spike Process
### Spike Process
1. **Document Spike:** Create a spike file in `work/1_upcoming/` named `{id}_spike_{slug}.md` with:
1. **Document Spike:** Create a spike file in `work/1_backlog/` named `{id}_spike_{slug}.md` with:
***Question:** What you need to answer
***Question:** What you need to answer
***Hypothesis:** What you expect to be true
***Hypothesis:** What you expect to be true
***Timebox:** Strict limit for the research
***Timebox:** Strict limit for the research
@@ -209,7 +209,7 @@ When the LLM context window fills up (or the chat gets slow/confused):
1. **Stop Coding.**
1. **Stop Coding.**
2. **Instruction:** Tell the user to open a new chat.
2. **Instruction:** Tell the user to open a new chat.
3. **Handoff:** The only context the new LLM needs is in the `specs/` folder and `.mcp.json`.
3. **Handoff:** The only context the new LLM needs is in the `specs/` folder and `.mcp.json`.
* *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_upcoming/` and `work/2_current/` to see what is pending."
* *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_backlog/` and `work/2_current/` to see what is pending."
---
---
@@ -221,7 +221,7 @@ If a user hands you this document and says "Apply this process to my project":
1. **Check for MCP Tools:** Look for `.mcp.json` in the project root. If it exists, you have programmatic access to workflow tools and agent spawning capabilities.
1. **Check for MCP Tools:** Look for `.mcp.json` in the project root. If it exists, you have programmatic access to workflow tools and agent spawning capabilities.
2. **Analyze the Request:** Ask for the high-level goal ("What are we building?") and the tech preferences ("Rust or Python?").
2. **Analyze the Request:** Ask for the high-level goal ("What are we building?") and the tech preferences ("Rust or Python?").
3. **Git Check:** Check if the directory is a git repository (`git status`). If not, run `git init`.
3. **Git Check:** Check if the directory is a git repository (`git status`). If not, run `git init`.
4. **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_upcoming/` through `work/6_archived/`).
4. **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_backlog/` through `work/6_archived/`).
5. **Draft Context:** Write `specs/00_CONTEXT.md` based on the user's answer.
5. **Draft Context:** Write `specs/00_CONTEXT.md` based on the user's answer.
6. **Draft Stack:** Write `specs/tech/STACK.md` based on best practices for that language.
6. **Draft Stack:** Write `specs/tech/STACK.md` based on best practices for that language.
Recurring issues observed during pipeline operation. Review periodically and create stories for systemic problems.
## 2026-03-18: Stories graduating to "done" with empty merges (7 of 10)
Pipeline allows stories to move through coding → QA → merge → done without any actual code changes landing on master. The squash-merge produces an empty diff but the pipeline still marks the story as done. Affected stories: 247, 273, 274, 278, 279, 280, 92. Only 266, 271, 277, and 281 actually shipped code. Root cause: no check that the merge commit contains a non-empty diff. Filed bug 283 for the manual_qa gate issue specifically, but the empty-merge-to-done problem is broader and needs its own fix.
## 2026-03-18: Agent committed directly to master instead of worktree
Multiple agents have committed directly to master instead of their worktree/feature branch:
- Commit `5f4591f` ("fix: update should_commit_stage test to match 5_done") — likely mergemaster
- Commit `a32cfbd` ("Add bot-level command registry with help command") — story 285 coder committed code + Cargo.lock directly to master
Agents should only commit to their feature branch or merge-queue branch, never to master directly. Suspect agents are running `git commit` in the project root instead of the worktree directory. This can also revert uncommitted fixes on master (e.g. project.toml pkill fix was overwritten). Frequency: at least 2 confirmed cases. This is a recurring and serious problem — needs a guard in the server or agent prompts.
## 2026-03-19: Auto-assign re-assigns mergemaster to failed merge stories in a loop
After bug 295 fix (`auto_assign_available_work` after every pipeline advance), mergemaster gets re-assigned to stories that already have a merge failure flag. Story 310 had an empty diff merge failure — mergemaster correctly reported the failure, but auto-assign immediately re-assigned mergemaster to the same story, creating an infinite retry loop. The auto-assign logic needs to check for the `merge_failure` front matter flag before re-assigning agents to stories in `4_merge/`.
## 2026-03-19: Coder produces no code (complete ghost — story 310)
Story 310 (Bot delete command) went through the full pipeline — coder session ran, passed QA/gates, moved to merge — but the coder produced zero code. No commits on the feature branch, no commits on master. The entire agent session was a no-op. This is different from the "committed to master instead of worktree" problem — in this case, the coder simply did nothing. Need to investigate the coder logs to understand what happened. The empty-diff merge check would catch this at merge time, but ideally the server should detect "coder finished with no commits on feature branch" at the gate-check stage and fail early.
## 2026-03-19: Auto-assign assigns mergemaster to coding-stage stories
Auto-assign picked mergemaster for story 310 which was in `2_current/`. Mergemaster should only work on stories in `4_merge/`. The `auto_assign_available_work` function doesn't enforce that the agent's configured stage matches the pipeline stage of the story it's being assigned to. Story 279 (auto-assign respects agent stage from front matter) was supposed to fix this, but the check may only apply to front-matter preferences, not the fallback assignment path.
system_prompt="You are a supervisor agent. Read CLAUDE.md and .story_kit/README.md first to understand the project dev process. Use MCP tools to coordinate sub-agents. Never implement code directly - always delegate to coder agents and monitor their progress. Use wait_for_agent to block until the coder finishes — the server automatically runs acceptance gates when the agent process exits. Never accept stories or merge to master - get all gates green and report to the human."
[[agent]]
[[agent]]
name="coder-1"
name="coder-1"
stage="coder"
stage="coder"
@@ -56,8 +32,8 @@ role = "Full-stack engineer. Implements features across all components."
model="sonnet"
model="sonnet"
max_turns=50
max_turns=50
max_budget_usd=5.00
max_budget_usd=5.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results."
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
[[agent]]
name="coder-2"
name="coder-2"
@@ -66,8 +42,18 @@ role = "Full-stack engineer. Implements features across all components."
model="sonnet"
model="sonnet"
max_turns=50
max_turns=50
max_budget_usd=5.00
max_budget_usd=5.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results."
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
name="coder-3"
stage="coder"
role="Full-stack engineer. Implements features across all components."
model="sonnet"
max_turns=50
max_budget_usd=5.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a full-stack engineer working autonomously in a git worktree. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
[[agent]]
name="qa-2"
name="qa-2"
@@ -87,12 +73,12 @@ Read CLAUDE.md first, then .story_kit/README.md to understand the dev process.
@@ -143,8 +129,8 @@ role = "Senior full-stack engineer for complex tasks. Implements features across
model="opus"
model="opus"
max_turns=80
max_turns=80
max_budget_usd=20.00
max_budget_usd=20.00
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results."
prompt="You are working in a git worktree on story {{story_id}}. Read CLAUDE.md first, then .story_kit/README.md to understand the dev process. The story details are in your prompt above. Follow the SDTW process through implementation and verification (Steps 1-3). The worktree and feature branch already exist - do not create them. Check .mcp.json for MCP tools. Do NOT accept the story or merge - commit your work and stop. If the user asks to review your changes, tell them to run: cd \"{{worktree_path}}\" && git difftool {{base_branch}}...HEAD\n\nIMPORTANT: Commit all your work before your process exits. The server will automatically run acceptance gates (cargo clippy + tests) when your process exits and advance the pipeline based on the results.\n\n## Bug Workflow: Root Cause First\nWhen working on bugs:\n1. Investigate the root cause before writing any fix. Use `git bisect` to find the breaking commit or `git log` to trace history. Read the relevant code before touching anything.\n2. Fix the root cause with a surgical, minimal change. Do NOT add new abstractions, wrappers, or workarounds when a targeted fix to the original code is possible.\n3. Write commit messages that explain what broke and why, not just what was changed.\n4. If you cannot determine the root cause after thorough investigation, document what you tried and why it was inconclusive — do not guess and ship a speculative fix."
system_prompt="You are a senior full-stack engineer working autonomously in a git worktree. You handle complex tasks requiring deep architectural understanding. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits."
system_prompt="You are a senior full-stack engineer working autonomously in a git worktree. You handle complex tasks requiring deep architectural understanding. Follow the Story-Driven Test Workflow strictly. Run cargo clippy and biome checks before considering work complete. Commit all your work before finishing - use a descriptive commit message. Do not accept stories, move them to archived, or merge to master - a human will do that. Do not coordinate with other agents - focus on your assigned story. The server automatically runs acceptance gates when your process exits. For bugs, always find and fix the root cause. Use git bisect to find breaking commits. Do not layer new code on top of existing code when a surgical fix is possible. If root cause is unclear after investigation, document what you tried rather than guessing."
[[agent]]
[[agent]]
name="qa"
name="qa"
@@ -164,12 +150,12 @@ Read CLAUDE.md first, then .story_kit/README.md to understand the dev process.
4.**Understandintent,notjustsyntax.**Thefeaturebranchmaybebehindmaster—master's version of shared infrastructure is almost always correct. The feature branch'scontributionistheNEWfunctionalityitadds.Yourjobistointegratethenewintomaster'sstructure,notpickoneside.
5.Resolvebyintegratingthefeature's new functionality into master'scodestructure
**bot.rstokio::select!conflicts:**Masterhasa`tokio::select!`loopin`handle_message()`thathandlespermissionforwarding(story275).Featurebranchescreatedbeforestory275haveasimplerdirect`provider.chat_stream().await`call.Resolution:KEEPmaster's tokio::select! loop. Integrate only the feature'snewlogic(e.g.typingindicators,newcallbacks)intotheexistingloopstructure.DoNOTreplacetheloopwiththeolddirectcall.
system_prompt="You are the mergemaster agent. Your primary responsibility is to trigger the merge_agent_work MCP tool and report the results. CRITICAL: Never manually move story files or call accept_story. When merge fails, call report_merge_failure to record the failure. For minor gate failures (syntax errors, unused imports, missing semicolons), attempt to fix them yourself — but stop after 2 attempts, call report_merge_failure, and report to the human. For complex failures or unresolvable conflicts, call report_merge_failure and report clearly so the human can act. The merge pipeline automatically resolves simple additive conflicts."
system_prompt="You are the mergemaster agent. Your primary job is to merge feature branches to master. First try the merge_agent_work MCP tool. If the auto-resolver fails on complex conflicts, resolve them yourself in the merge worktree — you are an opus-class agent capable of understanding both sides of a conflict and producing correct merged code. Common patterns: keep master's tokio::select! permission loop in bot.rs, discard story file rename conflicts (gitignored), remove duplicate definitions. After resolving, verify compilation before re-triggering merge. CRITICAL: Never manually move story files or call accept_story. After 3 failed fix attempts, call report_merge_failure and stop."
@@ -9,7 +9,7 @@ This project is a standalone Rust **web server binary** that serves a Vite/React
***Framework:** Poem HTTP server with WebSocket support for streaming; HTTP APIs should use Poem OpenAPI (Swagger) for non-streaming endpoints.
***Framework:** Poem HTTP server with WebSocket support for streaming; HTTP APIs should use Poem OpenAPI (Swagger) for non-streaming endpoints.
* **Frontend:** TypeScript + React
* **Frontend:** TypeScript + React
***Build Tool:** Vite
***Build Tool:** Vite
***Package Manager:** pnpm (required)
***Package Manager:** npm
***Styling:** CSS Modules or Tailwind (TBD - Defaulting to CSS Modules)
***Styling:** CSS Modules or Tailwind (TBD - Defaulting to CSS Modules)
***State Management:** React Context / Hooks
***State Management:** React Context / Hooks
***Chat UI:** Rendered Markdown with syntax highlighting.
***Chat UI:** Rendered Markdown with syntax highlighting.
@@ -91,8 +91,8 @@ To support both Remote and Local models, the system implements a `ModelProvider`
* **Quality Gates:**
* **Quality Gates:**
* `npx @biomejs/biome check src/` must show 0 errors, 0 warnings
* `npx @biomejs/biome check src/` must show 0 errors, 0 warnings
* `npm run build` must succeed
* `npm run build` must succeed
* `npx vitest run` must pass
* `npm test` must pass
* `npx playwright test` must pass
* `npm run test:e2e` must pass
* No `any` types allowed (use proper types or `unknown`)
* No `any` types allowed (use proper types or `unknown`)
* React keys must use stable IDs, not array indices
* React keys must use stable IDs, not array indices
* All buttons must have explicit `type` attribute
* All buttons must have explicit `type` attribute
@@ -118,8 +118,8 @@ To support both Remote and Local models, the system implements a `ModelProvider`
Multiple instances can run simultaneously in different worktrees. To avoid port conflicts:
Multiple instances can run simultaneously in different worktrees. To avoid port conflicts:
- **Backend:** Set `STORYKIT_PORT` to a unique port (default is 3001). Example: `STORYKIT_PORT=3002 cargo run`
- **Backend:** Set `STORKIT_PORT` to a unique port (default is 3001). Example: `STORKIT_PORT=3002 cargo run`
- **Frontend:** Run `pnpm dev` from `frontend/`. It auto-selects the next unused port. It reads `STORYKIT_PORT` to know which backend to talk to, so export it before running: `export STORYKIT_PORT=3002 && cd frontend && pnpm dev`
- **Frontend:** Run `npm run dev` from `frontend/`. It auto-selects the next unused port. It reads `STORKIT_PORT` to know which backend to talk to, so export it before running: `export STORKIT_PORT=3002 && cd frontend && npm run dev`
When running in a worktree, use a port that won't conflict with the main instance (3001). Ports 3002+ are good choices.
When running in a worktree, use a port that won't conflict with the main instance (3001). Ports 3002+ are good choices.
@@ -127,4 +127,4 @@ When running in a worktree, use a port that won't conflict with the main instanc
1. **Project Scope:** The application must strictly enforce that it does not read/write outside the `project_root` selected by the user.
1. **Project Scope:** The application must strictly enforce that it does not read/write outside the `project_root` selected by the user.
2. **Human in the Loop:**
2. **Human in the Loop:**
* Shell commands that modify state (non-readonly) should ideally require a UI confirmation (configurable).
* Shell commands that modify state (non-readonly) should ideally require a UI confirmation (configurable).
- **Blocker**: `matrix-sdk 0.16.0` pins `rusqlite 0.37.x` which pins `libsqlite3-sys 0.35.0`. A clean upgrade requires either waiting for matrix-sdk to bump their rusqlite dep, or upgrading matrix-sdk itself.
- **Reverted 2026-03-17**: A previous coder vendored the entire rusqlite crate with a fake `0.37.99` version and patched its libsqlite3-sys dep. This was too hacky — reverted to clean `0.35.0`.
## Acceptance Criteria
- [ ]`libsqlite3-sys` is upgraded to `0.37.0` via a clean dependency path (no vendored forks)
- [ ]`cargo build` succeeds
- [ ] All tests pass
- [ ] No `[patch.crates-io]` hacks or vendored crates
name: "Evaluate Docker/OrbStack for agent isolation and resource limiting"
agent: coder-opus
---
# Spike 329: Evaluate Docker/OrbStack for agent isolation and resource limiting
## Question
Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.
Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:
1.**Host isolation** — storkit can't touch anything outside the container
2.**Clean install/uninstall** — `docker run` to start, `docker rm` to remove
3.**Reproducible environment** — same container works on any machine
4.**Distributable product** — `docker pull storkit` for new users
5.**Resource limits** — cap total CPU/memory for the whole system
name: "Abstract agent runtime to support non-Claude-Code backends"
---
# Refactor 343: Abstract agent runtime to support non-Claude-Code backends
## Current State
- TBD
## Desired State
Currently agent spawning is tightly coupled to Claude Code CLI — agents are spawned as PTY processes running the `claude` binary. To support ChatGPT and Gemini as agent backends, we need to abstract the agent runtime.
The agent pool currently does:
1. Spawn `claude` CLI process via portable-pty
2. Stream JSON events from stdout
3. Parse tool calls, text output, thinking traces
4. Wait for process exit, run gates
This needs to become a trait so different backends can be plugged in:
- OpenAI API — calls ChatGPT via API with tool definitions, manages conversation loop
- Gemini API — calls Gemini via API with tool definitions, manages conversation loop
The key abstraction is: an agent runtime takes a prompt + tools and produces a stream of events (text output, tool calls, completion). The existing PTY/Claude Code logic becomes one implementation of this trait.
## Acceptance Criteria
- [ ] Define an AgentRuntime trait with methods for: start, stream_events, stop, get_status
- [ ] ClaudeCodeRuntime implements the trait using existing PTY spawning logic
- [ ] Agent pool uses the trait instead of directly spawning Claude Code
- [ ] Runtime selection is configurable per agent in project.toml (e.g. runtime = 'claude-code')
- [ ] All existing Claude Code agent functionality preserved
- [ ] Event stream format is runtime-agnostic (text, tool_call, thinking, done)
As a project owner, I want to run agents using ChatGPT (GPT-4o, o3, etc.) via the OpenAI API, so that I can use OpenAI models for coding tasks alongside Claude.
## Acceptance Criteria
- [ ] Implement OpenAiRuntime using the AgentRuntime trait from refactor 343
- [ ] Supports GPT-4o and o3 models via the OpenAI chat completions API
- [ ] Manages a conversation loop: send prompt + tool definitions, execute tool calls, continue until done
- [ ] Agents connect to storkit's MCP server for all tool operations — no custom file/bash tools needed
- [ ] MCP tool definitions are converted to OpenAI function calling format
- [ ] Configurable in project.toml: runtime = 'openai', model = 'gpt-4o'
- [ ] OPENAI_API_KEY passed via environment variable
- [ ] Token usage tracked and logged to token_usage.jsonl
- [ ] Agent output streams to the same event system (web UI, bot notifications)
# Story 345: Gemini agent backend via Google AI API
## User Story
As a project owner, I want to run agents using Gemini (2.5 Pro, etc.) via the Google AI API, so that I can use Google models for coding tasks alongside Claude and ChatGPT.
## Acceptance Criteria
- [ ] Implement GeminiRuntime using the AgentRuntime trait from refactor 343
- [ ] Supports Gemini 2.5 Pro and other Gemini models via the Google AI generativeai API
- [ ] Manages a conversation loop: send prompt + tool definitions, execute tool calls, continue until done
- [ ] Agents connect to storkit's MCP server for all tool operations — no custom file/bash tools needed
- [ ] MCP tool definitions are converted to Gemini function calling format
- [ ] Configurable in project.toml: runtime = 'gemini', model = 'gemini-2.5-pro'
- [ ] GOOGLE_AI_API_KEY passed via environment variable
- [ ] Token usage tracked and logged to token_usage.jsonl
- [ ] Agent output streams to the same event system (web UI, bot notifications)
As a non-Claude agent connected via MCP, I want a code intelligence tool so that I can find function, struct, and type definitions without grepping through all files.
## Acceptance Criteria
- [ ] get_definitions tool — finds function/struct/enum/type/class definitions by name or pattern
- [ ] Supports Rust (fn, struct, enum, impl, trait) and TypeScript (function, class, interface, type) at minimum
- [ ] Returns file path, line number, and the definition signature
- [ ] Scoped to the agent's worktree
- [ ] Faster than grepping — uses tree-sitter or regex-based parsing
name: "Web UI agent assignment dropdown on work items"
---
# Story 339: Web UI agent assignment dropdown on work items
## User Story
As a project owner using the web UI, I want to select which agent to assign to a work item from a dropdown, so that I can control agent assignments visually.
## Acceptance Criteria
- [ ] Agent dropdown visible in expanded work item detail panel
- [ ] Shows available agents filtered by appropriate stage (coders for current, QA for qa, mergemaster for merge)
- [ ] Selecting an agent stops any current agent and starts the new one
- [ ] Updates the story front matter with the agent assignment
- [ ] Shows agent status (running, idle) in the dropdown
name: "Bot reset command to clear conversation context"
---
# Story 351: Bot reset command to clear conversation context
## User Story
As a project owner in a chat room, I want to type "{bot_name} reset" to drop the current Claude Code session and start fresh, so that I can reduce token usage when context gets bloated without restarting the server.
## Acceptance Criteria
- [ ] '{bot_name} reset' kills the current Claude Code session
- [ ] A new session starts immediately with clean context
- [ ] Memories persist via the file system (auto-memory directory is unchanged)
- [ ] Bot confirms the reset with a short message
- [ ] Registered in the command registry so it appears in help output
name: "Ambient on/off command not intercepted by bot after refactors"
---
# Bug 352: Ambient on/off command not intercepted by bot after refactors
## Description
The ambient on/off bot command stopped being intercepted by the bot after the recent refactors (328 split commands.rs into modules, 330 consolidated chat transports into chat/ module). Messages like "timmy ambient off", "ambient off", and "ambient on" are being forwarded to the LLM instead of being handled at the bot level. The ambient toggle was previously handled in bot.rs before the command registry dispatch — it may not have been properly wired up after the code was moved to the chat/ module structure.
## How to Reproduce
1. Type "timmy ambient off" in a Matrix room where ambient mode is on
2. Observe that the message is forwarded to Claude instead of being intercepted
3. Same for "timmy ambient on", "ambient off", "ambient on"
## Actual Result
Ambient toggle commands are forwarded to the LLM as regular messages.
## Expected Result
Ambient toggle commands should be intercepted at the bot level and toggle ambient mode without invoking the LLM, with a confirmation message sent directly.
## Acceptance Criteria
- [ ] 'timmy ambient on' toggles ambient mode on and sends confirmation without LLM invocation
- [ ] 'timmy ambient off' toggles ambient mode off and sends confirmation without LLM invocation
- [ ] Ambient toggle works after refactors 328 and 330
- [ ] Ambient state persists in bot.toml as before
@@ -10,7 +10,7 @@ The `prompt_permission` MCP tool returns plain text ("Permission granted for '..
## How to Reproduce
## How to Reproduce
1. Start the story-kit server and open the web UI
1. Start the storkit server and open the web UI
2. Chat with the claude-code-pty model
2. Chat with the claude-code-pty model
3. Ask it to do something that requires a tool NOT in `.claude/settings.json` allow list (e.g. `wc -l /etc/hosts`, or WebFetch to a non-allowed domain)
3. Ask it to do something that requires a tool NOT in `.claude/settings.json` allow list (e.g. `wc -l /etc/hosts`, or WebFetch to a non-allowed domain)
@@ -6,7 +6,7 @@ name: "Retry limit for mergemaster and pipeline restarts"
## User Story
## User Story
As a developer using story-kit, I want pipeline auto-restarts to have a configurable retry limit so that failing agents don't loop infinitely consuming CPU and API credits.
As a developer using storkit, I want pipeline auto-restarts to have a configurable retry limit so that failing agents don't loop infinitely consuming CPU and API credits.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.