Compare commits
v0.4.1...0416bf343c (56 commits)
@@ -1,20 +0,0 @@
---
name: "Gate pipeline transitions on ensure_acceptance"
---

# Story 169: Gate pipeline transitions on ensure_acceptance

## User Story

As a project owner, I want story progression to be blocked unless ensure_acceptance passes, so that agents can't skip the testing workflow.

## Acceptance Criteria

- [ ] move_story_to_merge rejects stories that haven't passed ensure_acceptance
- [ ] accept_story rejects stories that haven't passed ensure_acceptance
- [ ] Rejection returns a clear error message telling the agent what's missing
- [ ] Existing passing stories (all criteria checked, tests recorded) still flow through normally

## Out of Scope

- TBD
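A minimal sketch of what such a gate could look like; the `StoryGate` struct and its field names are hypothetical, not storkit's actual types. The point is that both `move_story_to_merge` and `accept_story` call the same guard, and the error enumerates exactly what is missing:

```rust
// Hypothetical sketch of the ensure_acceptance gate; type and field names
// are assumptions, not storkit's real API.
struct StoryGate {
    criteria_checked: bool, // all acceptance criteria ticked
    tests_recorded: bool,   // test results recorded for this story
}

impl StoryGate {
    fn acceptance_passed(&self) -> bool {
        self.criteria_checked && self.tests_recorded
    }
}

/// Shared guard for move_story_to_merge / accept_story: reject with a
/// message that tells the agent exactly what is missing.
fn ensure_acceptance(gate: &StoryGate) -> Result<(), String> {
    if gate.acceptance_passed() {
        return Ok(());
    }
    let mut missing = Vec::new();
    if !gate.criteria_checked {
        missing.push("unchecked acceptance criteria");
    }
    if !gate.tests_recorded {
        missing.push("no recorded test results");
    }
    Err(format!(
        "story blocked by ensure_acceptance: {}",
        missing.join(", ")
    ))
}

fn main() {
    let failing = StoryGate { criteria_checked: true, tests_recorded: false };
    println!("{:?}", ensure_acceptance(&failing));
}
```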
@@ -1,69 +0,0 @@
---
name: "Evaluate Docker/OrbStack for agent isolation and resource limiting"
agent: coder-opus
---

# Spike 329: Evaluate Docker/OrbStack for agent isolation and resource limiting

## Question

Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.

Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:

1. **Host isolation** — storkit can't touch anything outside the container
2. **Clean install/uninstall** — `docker run` to start, `docker rm` to remove
3. **Reproducible environment** — same container works on any machine
4. **Distributable product** — `docker pull storkit` for new users
5. **Resource limits** — cap total CPU/memory for the whole system

## Architecture

```
Docker Container (single)
├── storkit server
│   ├── Matrix bot
│   ├── WhatsApp webhook
│   ├── Slack webhook
│   ├── Web UI
│   └── MCP server
├── Agent processes (coder-1, coder-2, coder-opus, qa, mergemaster)
├── Rust toolchain + Node.js + Claude Code CLI
└── /workspace (bind-mounted project repo from host)
```

## Key questions to answer

- **Performance**: How much slower are cargo builds inside the container on macOS? Compare Docker Desktop vs OrbStack for bind-mounted volumes.
- **Dockerfile**: What's the minimal image for the full stack? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest + git.
- **Bind mounts**: The project repo is bind-mounted from the host. Any filesystem performance concerns with OrbStack?
- **Networking**: Container exposes web UI port (3000). Matrix/WhatsApp/Slack connect outbound. Any issues?
- **API key**: Pass ANTHROPIC_API_KEY as env var to the container.
- **Git**: Git operations happen inside the container on the bind-mounted repo. Commits are visible on the host immediately.
- **Cargo cache**: Use a named Docker volume for ~/.cargo/registry so dependencies persist across container restarts.
- **Claude Code state**: Where does Claude Code store its session data? Needs to persist or be in a volume.
- **OrbStack vs Docker Desktop**: Is OrbStack required for acceptable performance, or does Docker Desktop work too?
- **Server restart**: Does `rebuild_and_restart` work inside a container (re-exec with new binary)?

## Deliverable

A proof-of-concept Dockerfile, docker-compose.yml, and a short write-up with findings and performance benchmarks.

## Hypothesis

- TBD

## Timebox

- TBD

## Investigation Plan

- TBD

## Findings

- TBD

## Recommendation

- TBD
@@ -1,31 +0,0 @@
---
name: Agent Security and Sandboxing
---
# Story 34: Agent Security and Sandboxing

## User Story
**As a** supervisor orchestrating multiple autonomous agents,
**I want to** constrain what each agent can access and do,
**So that** agents can't escape their worktree, damage shared state, or perform unintended actions.

## Acceptance Criteria
- [ ] Agent creation accepts an `allowed_tools` list to restrict Claude Code tool access per agent.
- [ ] Agent creation accepts a `disallowed_tools` list as an alternative to allowlisting.
- [ ] Agents without Bash access can still perform useful coding work (Read, Edit, Write, Glob, Grep).
- [ ] Investigate replacing direct Bash/shell access with Rust-implemented tool proxies that enforce boundaries:
  - Scoped `exec_shell` that only runs allowlisted commands (e.g., `cargo test`, `npm test`) within the agent's worktree.
  - Scoped `read_file` / `write_file` that reject paths outside the agent's worktree root.
  - Scoped `git` operations that only work within the agent's worktree.
- [ ] Evaluate `--max-turns` and `--max-budget-usd` as safety limits for runaway agents.
- [ ] Document the trust model: what the supervisor controls vs what agents can do autonomously.

## Questions to Explore
- Can we use MCP (Model Context Protocol) to expose our Rust-implemented tools to Claude Code, replacing its built-in Bash/filesystem tools with scoped versions?
- What's the right granularity for shell allowlists — command-level (`cargo test`) or pattern-level (`cargo *`)?
- Should agents have read access outside their worktree (e.g., to reference shared specs) but write access only within it?
- Is OS-level sandboxing (Docker, macOS sandbox profiles) worth the complexity for a personal tool?

## Out of Scope
- Multi-user authentication or authorization (single-user personal tool).
- Network-level isolation between agents.
- Encrypting agent communication channels (all local).
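The scoped `read_file` / `write_file` boundary check can be sketched without touching the filesystem: normalize the requested path lexically and reject anything that steps outside the worktree root. This is a hypothetical illustration (function and error messages are assumptions), and a real implementation would also need to handle symlinks via canonicalization:

```rust
use std::path::{Component, Path, PathBuf};

// Hypothetical sketch of the worktree boundary check for scoped file tools.
// Rejects absolute paths and any relative path that escapes the root via `..`.
fn scoped_path(worktree: &Path, requested: &Path) -> Result<PathBuf, String> {
    if requested.is_absolute() {
        return Err(format!("absolute paths not allowed: {}", requested.display()));
    }
    let mut normalized = worktree.to_path_buf();
    let mut depth = 0usize; // how many components we may still pop
    for comp in requested.components() {
        match comp {
            Component::ParentDir => {
                if depth == 0 {
                    return Err(format!("path escapes worktree: {}", requested.display()));
                }
                depth -= 1;
                normalized.pop();
            }
            Component::CurDir => {}
            Component::Normal(part) => {
                depth += 1;
                normalized.push(part);
            }
            _ => return Err(format!("unsupported component in {}", requested.display())),
        }
    }
    Ok(normalized)
}

fn main() {
    println!("{:?}", scoped_path(Path::new("/wt"), Path::new("src/main.rs")));
    println!("{:?}", scoped_path(Path::new("/wt"), Path::new("../etc/passwd")));
}
```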
@@ -0,0 +1,21 @@
---
name: "Run storkit container under gVisor (runsc) runtime"
---

# Story 360: Run storkit container under gVisor (runsc) runtime

## User Story

As a storkit operator, I want the container to run under gVisor so that even if a malicious codebase escapes the container's process namespace, it cannot make raw syscalls to the host kernel.

## Acceptance Criteria

- [ ] docker-compose.yml specifies runtime: runsc
- [ ] PTY-based agent spawning (Claude Code via PTY) works correctly under runsc
- [ ] rebuild_and_restart (exec() replacement) works correctly under runsc
- [ ] Rust compilation inside the container completes successfully under runsc
- [ ] Document host setup requirement: runsc must be installed and registered in /etc/docker/daemon.json

## Out of Scope

- TBD
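The first criterion amounts to a one-line compose change. This is a hypothetical fragment, not the project's actual docker-compose.yml, and the service name is an assumption:

```yaml
# Hypothetical compose fragment: opt the storkit service into the gVisor runtime.
services:
  storkit:
    runtime: runsc
```

For the host-setup criterion: per the gVisor install docs, runsc must be registered in `/etc/docker/daemon.json` under a `runtimes` key (e.g. `{"runtimes": {"runsc": {"path": "/usr/local/bin/runsc"}}}`, path depends on the install) and the Docker daemon restarted, otherwise `runtime: runsc` fails at container start.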
@@ -1,18 +0,0 @@
---
name: Live Test Gate Updates
---

# Story 57: Live Test Gate Updates

## User Story

As a user, I want the Gate and Todo panels to update automatically when tests are recorded or acceptance is checked, so I can see progress without manually refreshing.

## Acceptance Criteria

- [ ] Server broadcasts a `{"type": "notification", "topic": "tests"}` event over `/ws` when tests are recorded, acceptance is checked, or coverage is collected
- [ ] GatePanel auto-refreshes its data when it receives a `tests` notification
- [ ] TodoPanel auto-refreshes its data when it receives a `tests` notification
- [ ] Manual refresh buttons continue to work
- [ ] Panels do not flicker or lose scroll position on auto-refresh
- [ ] End-to-end test: record test results via MCP, verify Gate panel updates without manual refresh
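The broadcast payload and the client-side decision can be sketched as two small functions; the payload string is taken from the criteria above, while the topic-matching helper is a hypothetical illustration of how a panel decides whether to refresh:

```rust
// The exact event shape from the acceptance criteria above.
fn tests_notification() -> String {
    r#"{"type": "notification", "topic": "tests"}"#.to_string()
}

// Hypothetical client-side filter: a panel refreshes only when the incoming
// topic is one it subscribed to (GatePanel and TodoPanel would both list "tests").
fn should_refresh(panel_topics: &[&str], incoming_topic: &str) -> bool {
    panel_topics.contains(&incoming_topic)
}

fn main() {
    println!("{}", tests_notification());
    println!("refresh: {}", should_refresh(&["tests"], "tests"));
}
```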
@@ -0,0 +1,212 @@
---
name: "Evaluate Docker/OrbStack for agent isolation and resource limiting"
agent: "coder-opus"
---

# Spike 329: Evaluate Docker/OrbStack for agent isolation and resource limiting

## Question

Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.

**Important context:** Storkit developing itself is the dogfood edge case. The primary use case is storkit managing agents that develop *other* projects, driven by multiple users in chat rooms (Matrix, WhatsApp, Slack). Isolation must account for untrusted codebases, multi-user command surfaces, and running against arbitrary repos — not just the single-developer self-hosted setup.

Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:

1. **Host isolation** — storkit can't touch anything outside the container
2. **Clean install/uninstall** — `docker run` to start, `docker rm` to remove
3. **Reproducible environment** — same container works on any machine
4. **Distributable product** — `docker pull storkit` for new users
5. **Resource limits** — cap total CPU/memory for the whole system

## Architecture

```
Docker Container (single)
├── storkit server
│   ├── Matrix bot
│   ├── WhatsApp webhook
│   ├── Slack webhook
│   ├── Web UI
│   └── MCP server
├── Agent processes (coder-1, coder-2, coder-opus, qa, mergemaster)
├── Rust toolchain + Node.js + Claude Code CLI
└── /workspace (bind-mounted project repo from host)
```

## Key questions to answer

- **Performance**: How much slower are cargo builds inside the container on macOS? Compare Docker Desktop vs OrbStack for bind-mounted volumes.
- **Dockerfile**: What's the minimal image for the full stack? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest + git.
- **Bind mounts**: The project repo is bind-mounted from the host. Any filesystem performance concerns with OrbStack?
- **Networking**: Container exposes web UI port (3000). Matrix/WhatsApp/Slack connect outbound. Any issues?
- **API key**: Pass ANTHROPIC_API_KEY as env var to the container.
- **Git**: Git operations happen inside the container on the bind-mounted repo. Commits are visible on the host immediately.
- **Cargo cache**: Use a named Docker volume for ~/.cargo/registry so dependencies persist across container restarts.
- **Claude Code state**: Where does Claude Code store its session data? Needs to persist or be in a volume.
- **OrbStack vs Docker Desktop**: Is OrbStack required for acceptable performance, or does Docker Desktop work too?
- **Server restart**: Does `rebuild_and_restart` work inside a container (re-exec with new binary)?

## Deliverable

A proof-of-concept Dockerfile, docker-compose.yml, and a short write-up with findings and performance benchmarks.

## Hypothesis

A single Docker container running the entire storkit stack (server + agents + toolchain) on OrbStack will provide acceptable performance for the primary use case (developing other projects) while giving us host isolation, resource limits, and a distributable product. OrbStack's VirtioFS should make bind-mounted filesystem performance close to native.

## Timebox

4 hours

## Investigation Plan

1. Audit storkit's runtime dependencies (Rust toolchain, Node.js, Claude Code CLI, cargo-nextest, git)
2. Determine where Claude Code stores session state (~/.claude)
3. Analyze how rebuild_and_restart works (exec() replacement) and whether it's container-compatible
4. Draft a multi-stage Dockerfile and docker-compose.yml
5. Document findings for each key question
6. Provide recommendation and follow-up stories
## Findings

### 1. Dockerfile: Minimal image for the full stack

**Result:** Multi-stage Dockerfile created at `docker/Dockerfile`.

The image requires these runtime components:

- **Rust 1.90+ toolchain** (~1.5 GB) — needed at runtime for `rebuild_and_restart` and agent-driven `cargo clippy`, `cargo test`, etc.
- **Node.js 22.x** (~100 MB) — needed at runtime for Claude Code CLI (npm global package)
- **Claude Code CLI** (`@anthropic-ai/claude-code`) — npm global, spawned by storkit via PTY
- **cargo-nextest** — pre-built binary, used by acceptance gates
- **git** — used extensively by agents and worktree management
- **System libs:** libssl3, ca-certificates

The build stage compiles the storkit binary with embedded frontend assets (build.rs runs `npm run build`). The runtime stage is based on `debian:bookworm-slim` but still needs Rust + Node because agents use them at runtime.

**Total estimated image size:** ~3-4 GB (dominated by the Rust toolchain). This is large but acceptable for a development tool that runs locally.
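The shape of such a multi-stage build can be sketched as follows. This is an illustration, not the actual `docker/Dockerfile`: base image tags, toolchain paths, and the Node install method are assumptions (Debian's packaged Node is older than 22.x, so a real image would pull Node from NodeSource or similar):

```dockerfile
# Sketch only; tags and paths are assumptions, not the real docker/Dockerfile.
FROM rust:1.90-bookworm AS build
RUN apt-get update && apt-get install -y nodejs npm   # build.rs runs `npm run build`
WORKDIR /app
COPY . .
RUN cargo build --release                              # embeds frontend via rust-embed

FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y git nodejs npm libssl3 ca-certificates \
 && npm install -g @anthropic-ai/claude-code
# The Rust toolchain must ship in the runtime stage: rebuild_and_restart and
# agent cargo commands need it.
COPY --from=build /usr/local/cargo /usr/local/cargo
COPY --from=build /usr/local/rustup /usr/local/rustup
ENV PATH=/usr/local/cargo/bin:$PATH RUSTUP_HOME=/usr/local/rustup CARGO_HOME=/usr/local/cargo
COPY --from=build /app /app
CMD ["/app/target/release/storkit"]
```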
### 2. Bind mounts and filesystem performance

**OrbStack** uses Apple's VirtioFS for bind mounts, which is near-native speed. This is a significant advantage over Docker Desktop's older options:

| Runtime | Bind mount driver | Performance | Notes |
|---------|------------------|-------------|-------|
| OrbStack | VirtioFS (native) | ~95% native | Default, no config needed |
| Docker Desktop | VirtioFS | ~85-90% native | Must enable in settings (Docker Desktop 4.15+) |
| Docker Desktop | gRPC-FUSE (legacy) | ~40-60% native | Default on older versions, very slow for cargo builds |
| Docker Desktop | osxfs (deprecated) | ~30-50% native | Ancient default, unusable for Rust projects |

**For cargo builds on bind-mounted volumes:** The critical path is `target/` directory I/O. Since `target/` lives inside the bind-mounted project, large Rust projects will see a noticeable slowdown on Docker Desktop with gRPC-FUSE. OrbStack's VirtioFS makes this tolerable.

**Mitigation option:** Keep `target/` in a named Docker volume instead of on the bind mount. This gives native Linux filesystem speed for compilation artifacts while the source code remains bind-mounted. The trade-off is that `target/` won't be visible on the host, which is fine since it's a build cache.

### 3. Claude Code state persistence

Claude Code stores all state in `~/.claude/`:

- `sessions/` — conversation transcripts (used by `--resume`)
- `projects/` — per-project settings and memory
- `history.jsonl` — command history
- `session-env/` — environment snapshots
- `settings.json` — global preferences

**Solution:** Mount `~/.claude` as a named Docker volume (`claude-state`). This persists across container restarts. Session resumption (`--resume <session_id>`) will work correctly since the session files are preserved.
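Findings 2 and 3 combine into a single volume layout; this is a hypothetical compose fragment (service name and host path are assumptions), not the actual `docker/docker-compose.yml`:

```yaml
# Sketch: bind-mount source, keep hot caches and state on named volumes.
services:
  storkit:
    image: storkit:latest
    volumes:
      - /my/project:/workspace               # source visible on host
      - workspace-target:/workspace/target   # build cache off the slow bind mount
      - claude-state:/root/.claude           # persist Claude Code sessions
      - cargo-registry:/usr/local/cargo/registry
volumes:
  workspace-target:
  claude-state:
  cargo-registry:
```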
### 4. Networking

**Straightforward.** The container exposes port 3001 for the web UI + MCP endpoint. All chat integrations (Matrix, Slack, WhatsApp) connect outbound from the container, which works by default in Docker's bridge networking. No special configuration needed.

Port mapping: `3001:3001` in docker-compose.yml. Users access the web UI at `http://localhost:3001`.

### 5. API key handling

**Simple.** Pass `ANTHROPIC_API_KEY` as an environment variable via docker-compose.yml. The storkit server already reads it from the environment. Claude Code also reads `ANTHROPIC_API_KEY` from the environment.

### 6. Git operations on bind-mounted repos

**Works correctly.** Git operations inside the container on a bind-mounted volume are immediately visible on the host (and vice versa). The key considerations:

- **Git config:** The container runs as root, so `git config --global user.name/email` needs to be set inside the container (or mounted from host). Without this, commits have no author identity.
- **File ownership:** OrbStack maps the container's root user to the host user automatically (uid remapping). Docker Desktop does not — files created by the container may be owned by root on the host. OrbStack handles this transparently.
- **Worktrees:** `git worktree add` inside the container creates worktrees within the bind-mounted repo, which are visible on the host. This is correct behavior.
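The git identity fix-up could live in a container entrypoint; this is a sketch under the assumption that identity is fed in via environment variables (the variable names are hypothetical, with fallbacks so commits always have an author):

```shell
# Hypothetical entrypoint fragment: derive git identity from env vars,
# falling back to a placeholder identity so commits are never authorless.
git config --global user.name  "${GIT_AUTHOR_NAME:-storkit}"
git config --global user.email "${GIT_AUTHOR_EMAIL:-storkit@localhost}"
```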
### 7. Cargo cache

**Named Docker volumes** for `/usr/local/cargo/registry` and `/usr/local/cargo/git` persist downloaded crates across container restarts. First `cargo build` downloads everything; subsequent builds use the cached crates. This is a standard Docker pattern.

### 8. OrbStack vs Docker Desktop

| Capability | OrbStack | Docker Desktop |
|-----------|----------|----------------|
| **VirtioFS (fast mounts)** | Default, always on | Must enable manually |
| **UID remapping** | Automatic (root → host user) | Manual or not available |
| **Memory usage** | ~50% less than Docker Desktop | Higher baseline overhead |
| **Startup time** | 1-2 seconds | 10-30 seconds |
| **License** | Free for personal use, paid for teams | Free for personal/small business, paid for enterprise |
| **Linux compatibility** | Full (Rosetta for x86 on ARM) | Full (QEMU for x86 on ARM) |

**Verdict:** OrbStack is strongly recommended for macOS. Docker Desktop works but requires VirtioFS to be enabled manually and has worse file ownership semantics. On Linux hosts, Docker Engine (not Desktop) is native and has none of these issues.
### 9. rebuild_and_restart inside a container

**Works with caveats.** The current implementation:

1. Runs `cargo build` from `CARGO_MANIFEST_DIR` (baked at compile time to `/app/server`)
2. Calls `exec()` to replace the process with the new binary

Inside a container, `exec()` works fine — it replaces the PID 1 process. However:

- The source tree must exist at `/app` inside the container (the path baked into the binary)
- The Rust toolchain must be available at runtime
- If the container is configured with `restart: unless-stopped`, a crash during rebuild could cause a restart loop

**The Dockerfile handles this** by copying the full source tree into `/app` in the runtime stage and including the Rust toolchain.

**Future improvement:** For the storkit-developing-itself case, mount the source tree as a volume at `/app` so code changes on the host are immediately available for rebuild. For the primary use case (developing other projects), the baked-in source is fine — the server doesn't change.
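Step 2 maps directly onto `CommandExt::exec` from the standard library. This sketch is not storkit's actual restart code (the function name is an assumption), but it shows the key property: on success `exec` never returns, so the function's return type is the error alone:

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

// On success exec() never returns: the current process image (PID 1 inside
// the container) is replaced by the new binary. It only returns on failure,
// handing back the io::Error to surface to the caller.
fn restart_with(binary: &str) -> std::io::Error {
    Command::new(binary).exec()
}

fn main() {
    // Demonstrate only the failure path; the success path would replace
    // this very process.
    let err = restart_with("/no/such/storkit-binary");
    println!("exec failed as expected: {}", err);
}
```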
### 10. Multi-user / untrusted codebase considerations

The single-container model provides **host isolation** but no **agent-to-agent isolation**:

- All agents share the same filesystem, network, and process namespace
- A malicious codebase could interfere with other agents or the storkit server itself
- This is acceptable as a first step since the primary threat model is "storkit shouldn't wreck the host"

For true multi-tenant isolation (multiple untrusted projects), a future architecture could:

- Run one container per project (each with its own bind mount)
- Use Docker's `--read-only` with specific writable mounts
- Apply seccomp/AppArmor profiles to limit syscalls

### 11. Image distribution

The single-container approach enables simple distribution:

```
docker pull ghcr.io/crashlabs/storkit:latest
docker run -e ANTHROPIC_API_KEY=sk-ant-... -v /my/project:/workspace -p 3001:3001 storkit
```

This is a massive UX improvement over "install Rust, install Node, install Claude Code, clone the repo, cargo build, etc."

## Recommendation

**Proceed with implementation.** The single-container Docker approach is viable and solves the stated goals:

1. **Host isolation** — achieved via standard Docker containerization
2. **Clean install/uninstall** — `docker compose up` / `docker compose down -v`
3. **Reproducible environment** — Dockerfile pins all versions
4. **Distributable product** — `docker pull` for new users
5. **Resource limits** — `deploy.resources.limits` in compose

### Follow-up stories to create

1. **Story: Implement Docker container build and CI** — Set up automated image builds, push to registry, test that the image works end-to-end with a sample project.
2. **Story: Target directory optimization** — Move `target/` to a named volume to avoid bind mount I/O overhead for cargo builds. Benchmark the improvement.
3. **Story: Git identity in container** — Configure git user.name/email inside the container (from env vars or mounted .gitconfig).
4. **Story: Per-project container isolation** — For multi-tenant deployments, run one storkit container per project with tighter security (read-only root, seccomp, no-new-privileges).
5. **Story: Health endpoint** — Add a `/health` HTTP endpoint to the storkit server for the Docker healthcheck.

### Risks and open questions

- **Image size (~3-4 GB):** Acceptable for a dev tool but worth optimizing later. The Rust toolchain dominates.
- **Rust toolchain at runtime:** Required for rebuild_and_restart and agent cargo commands. Cannot be eliminated without changing the architecture.
- **Claude Code CLI updates:** The CLI version is pinned at image build time. Users need to rebuild the image to get updates. Could use a volume mount for the npm global dir to allow in-place updates.
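The health-endpoint follow-up is small enough to prototype with only the standard library. This is a sketch, not storkit's server code (which would presumably reuse its existing HTTP stack); it answers any request on the socket with a 200 and a two-byte body:

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Minimal /health response suitable for a Docker HEALTHCHECK probe.
fn health_response() -> String {
    let body = "ok";
    format!(
        "HTTP/1.1 200 OK\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        body.len(),
        body
    )
}

// Accept one connection, read the request, reply, and close.
fn serve_once(listener: TcpListener) {
    if let Ok((mut stream, _)) = listener.accept() {
        let mut buf = [0u8; 1024];
        let _ = stream.read(&mut buf);
        let _ = stream.write_all(health_response().as_bytes());
    }
}

fn main() {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    let server = thread::spawn(move || serve_once(listener));

    // Probe it the way a healthcheck would.
    let mut stream = TcpStream::connect(addr).unwrap();
    stream.write_all(b"GET /health HTTP/1.1\r\n\r\n").unwrap();
    let mut resp = String::new();
    stream.read_to_string(&mut resp).unwrap();
    server.join().unwrap();
    println!("{}", resp.lines().next().unwrap());
}
```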
@@ -1,5 +1,6 @@
 ---
 name: "Abstract agent runtime to support non-Claude-Code backends"
+agent: coder-opus
 ---

 # Refactor 343: Abstract agent runtime to support non-Claude-Code backends
@@ -1,5 +1,6 @@
 ---
 name: "ChatGPT agent backend via OpenAI API"
+agent: coder-opus
 ---

 # Story 344: ChatGPT agent backend via OpenAI API
@@ -0,0 +1,18 @@
---
name: "Start command should say queued not error when all coders are busy"
---

# Story 356: Start command should say queued not error when all coders are busy

## User Story

As a ..., I want ..., so that ...

## Acceptance Criteria

- [ ] When all coders are busy, 'start' command responds with a short queued message instead of an error
- [ ] Message tone is neutral/positive, not a failure message

## Out of Scope

- TBD
@@ -0,0 +1,20 @@
---
name: "Bot assign command to pre-assign a model to a story"
---

# Story 357: Bot assign command to pre-assign a model to a story

## User Story

As a user, I want to assign a specific model (e.g. opus) to a story before it starts, so that when a coder picks it up it uses the model I chose.

## Acceptance Criteria

- [ ] Bot recognizes `assign <number> <model>` command
- [ ] Assignment persists in the story file so it's used when the story starts
- [ ] Command appears in help output
- [ ] Works with available model names (e.g. opus, sonnet)

## Out of Scope

- TBD
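The `assign <number> <model>` parsing can be sketched as a small pure function; the allowed-model list here is an assumption based on the examples in the criteria (the bot's real command dispatch is not shown in this diff):

```rust
// Hypothetical parser for the bot command `assign <number> <model>`.
// Returns None for anything that isn't a well-formed assign command.
fn parse_assign(input: &str) -> Option<(u32, String)> {
    let mut parts = input.split_whitespace();
    if parts.next()? != "assign" {
        return None;
    }
    let number: u32 = parts.next()?.parse().ok()?;
    let model = parts.next()?.to_lowercase();
    // Known model names only (e.g. opus, sonnet); list is an assumption.
    if !["opus", "sonnet"].contains(&model.as_str()) {
        return None;
    }
    Some((number, model))
}

fn main() {
    println!("{:?}", parse_assign("assign 357 opus"));
}
```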
@@ -0,0 +1,20 @@
---
name: "Remove Makefile and make script/release the single entry point for releases"
---

# Story 358: Remove Makefile and make script/release the single entry point for releases

## User Story

As a ..., I want ..., so that ...

## Acceptance Criteria

- [ ] Makefile is deleted
- [ ] script/release requires a version argument and prints usage if missing
- [ ] script/release still builds macOS and Linux binaries, bumps versions, generates changelog, tags, and publishes to Gitea
- [ ] No dependency on make

## Out of Scope

- TBD
Makefile (38 lines)
@@ -1,38 +0,0 @@
.PHONY: help build-macos build-linux release

help:
	@echo "Story Kit – cross-platform build targets"
	@echo ""
	@echo "  make build-macos       Build native macOS release binary"
	@echo "  make build-linux       Build static Linux x86_64 release binary (requires cross + Docker)"
	@echo "  make release V=x.y.z   Build both targets and publish a Gitea release"
	@echo ""
	@echo "Prerequisites:"
	@echo "  build-macos: Rust stable toolchain, npm"
	@echo "  build-linux: cargo install cross AND Docker Desktop running"
	@echo ""
	@echo "Output:"
	@echo "  macOS : target/release/storkit"
	@echo "  Linux : target/x86_64-unknown-linux-musl/release/storkit"

## Build a native macOS release binary.
## The frontend is compiled by build.rs (npm run build) and embedded via rust-embed.
## Verify dynamic deps afterwards: otool -L target/release/storkit
build-macos:
	cargo build --release

## Build a fully static Linux x86_64 binary using the musl libc target.
## cross (https://github.com/cross-rs/cross) handles the Docker-based cross-compilation.
## Install cross: cargo install cross
## The resulting binary has zero dynamic library dependencies (ldd reports "not a dynamic executable").
build-linux:
	cross build --release --target x86_64-unknown-linux-musl

## Publish a release to Gitea with macOS and Linux binaries.
## Requires: GITEA_TOKEN env var, cross, Docker running.
## Usage: make release V=0.2.0
release:
ifndef V
	$(error Usage: make release V=x.y.z)
endif
	script/release $(V)
74
TIMMY_BRIEFING.md
Normal file
74
TIMMY_BRIEFING.md
Normal file
@@ -0,0 +1,74 @@
|
|||||||
|
# Briefing for Timmy — Spike 329
|
||||||
|
|
||||||
|
Hey Timmy. You're running inside a Docker container as part of spike 329. Here's everything
|
||||||
|
you need to know to pick up where we left off.
|
||||||
|
|
||||||
|
## What this spike is
|
||||||
|
|
||||||
|
Evaluate running the full storkit stack (server, agents, web UI) inside a single Docker
|
||||||
|
container, using OrbStack on macOS for better bind-mount performance. The goal is host
|
||||||
|
isolation — not agent-to-agent isolation. Read the full spike doc at:
|
||||||
|
|
||||||
|
`.storkit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md`
|
||||||
|
|
||||||
|
## What's been done (2026-03-21)
|
||||||
|
|
||||||
|
### Environment confirmed
|
||||||
|
- Debian 12 bookworm, arm64, 10 CPUs
|
||||||
|
- Rust 1.90.0, Node v22.22.1, git 2.39.5, Claude Code CLI — all present
|
||||||
|
- Running under **OrbStack** (confirmed via bind-mount path `/run/host_mark/Users → /workspace`)
|
||||||
|
|
||||||
|
### Key benchmarks run
|
||||||
|
Bind-mount directory traversal is **~23x slower per file** than a Docker volume:
|
||||||
|
|
||||||
|
| Filesystem | Files | Time |
|
||||||
|
|---|---|---|
|
||||||
|
| Docker volume (`cargo/registry`) | 21,703 | 38ms |
|
||||||
|
| Bind mount `target/` subtree | 270,550 | 10,564ms |
|
||||||
|
| Bind mount non-target | 50,048 | 11,314ms |
|
||||||
|
|
||||||
|
Sequential I/O is fine (440 MB/s write, 1.3 GB/s read on bind mount). The problem is
|
||||||
|
purely stat-heavy operations — exactly what cargo does on incremental builds.
|
||||||
|
|
||||||
### Two bugs found and fixed

**Bug 1 — `target/` on bind mount** (`docker/docker-compose.yml`)
Added named Docker volumes to keep build artifacts off the slow bind mount:

```yaml
      - workspace-target:/workspace/target
      - storkit-target:/app/target
```

**Bug 2 — missing `build-essential` in runtime stage** (`docker/Dockerfile`)
The runtime stage copies the Rust toolchain but not `gcc`/`cc`. `cargo build` fails with
`linker 'cc' not found`. Fixed by adding `build-essential`, `pkg-config`, `libssl-dev`
to the runtime apt-get block.

### `./..:/app` bind mount

The original commit had this commented out. Another bot uncommented it — this is correct.
It lets `rebuild_and_restart` pick up live host changes. The `storkit-target:/app/target`
volume keeps `/app/target` off the bind mount.

## What still needs doing

1. **Rebuild the image** with the patched Dockerfile and run a full `cargo build --release`
   benchmark end-to-end. This couldn't be done in the first session because the container
   was already running the old (pre-fix) image.

2. **Docker Desktop vs OrbStack comparison** — repeat the benchmarks with Docker Desktop
   to quantify the performance delta. We expect OrbStack to be significantly faster due to
   VirtioFS vs gRPC-FUSE, but need actual numbers.

## Worktree git note

The worktree git refs are broken inside the container — they reference the host path
(`/Users/dave/workspace/...`) which doesn't exist in the container. Use
`git -C /workspace <command>` instead of running git from the worktree dir.
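The `-C` workaround runs git as if it had been started in the given directory, sidestepping the broken worktree refs. A throwaway demonstration (the `/workspace` path in the comment is the container layout from the spike; the temp repo is purely illustrative):

```shell
# Worktree .git files point at host paths that don't exist in the
# container, so address the repo root explicitly, e.g.:
#   git -C /workspace status
# The -C flag itself behaves like this:
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" rev-parse --is-inside-work-tree   # prints "true"
```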
## Files changed so far (uncommitted)

- `docker/Dockerfile` — added `build-essential`, `pkg-config`, `libssl-dev` to runtime stage
- `docker/docker-compose.yml` — added `workspace-target` and `storkit-target` volumes
- `.storkit/work/1_backlog/329_spike_...md` — findings written up in full

These changes are **not yet committed**. Commit them before rebuilding the container.
10 docker/.dockerignore Normal file
@@ -0,0 +1,10 @@
# Docker build context exclusions
target/
frontend/node_modules/
frontend/dist/
.storkit/worktrees/
.storkit/work/6_archived/
.git/
*.swp
*.swo
.DS_Store
115 docker/Dockerfile Normal file
@@ -0,0 +1,115 @@
# Story Kit – single-container runtime
# All components (server, agents, web UI) run inside this container.
# The target project repo is bind-mounted at /workspace.
#
# Build: docker build -t storkit -f docker/Dockerfile .
# Run:   docker compose -f docker/docker-compose.yml up
#
# Tested with: OrbStack (recommended on macOS), Docker Desktop (slower bind mounts)

FROM rust:1.90-bookworm AS base

# ── System deps ──────────────────────────────────────────────────────
RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    curl \
    ca-certificates \
    build-essential \
    pkg-config \
    libssl-dev \
    # cargo-nextest is a pre-built binary
    && rm -rf /var/lib/apt/lists/*

# ── Node.js 22.x (matches host) ─────────────────────────────────────
RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \
    && apt-get install -y --no-install-recommends nodejs \
    && rm -rf /var/lib/apt/lists/*

# ── cargo-nextest (test runner) ──────────────────────────────────────
RUN curl -LsSf https://get.nexte.st/latest/linux | tar zxf - -C /usr/local/bin

# ── Claude Code CLI ──────────────────────────────────────────────────
# Claude Code is distributed as an npm global package.
# The CLI binary is `claude`.
RUN npm install -g @anthropic-ai/claude-code

# ── Biome (frontend linter) ─────────────────────────────────────────
# Installed project-locally via npm install, but having it global avoids
# needing node_modules for CI-style checks.

# ── Working directory ────────────────────────────────────────────────
# /app holds the storkit source (copied in at build time for the binary).
# /workspace is where the target project repo gets bind-mounted at runtime.
WORKDIR /app

# ── Build the storkit server binary ─────────────────────────────────
# Copy the full project tree so `cargo build` and `npm run build` (via
# build.rs) can produce the release binary with embedded frontend assets.
COPY . .

# Build frontend deps first (better layer caching)
RUN cd frontend && npm ci

# Build the release binary (build.rs runs npm run build for the frontend)
RUN cargo build --release \
    && cp target/release/storkit /usr/local/bin/storkit

# ── Runtime stage (smaller image) ───────────────────────────────────
FROM debian:bookworm-slim AS runtime

RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    curl \
    ca-certificates \
    libssl3 \
    # build-essential (gcc/cc) needed at runtime for:
    # - rebuild_and_restart (cargo build --release)
    # - agent-driven cargo commands (clippy, test, build)
    build-essential \
    pkg-config \
    libssl-dev \
    && rm -rf /var/lib/apt/lists/*

# Node.js in runtime
RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - \
    && apt-get install -y --no-install-recommends nodejs \
    && rm -rf /var/lib/apt/lists/*

# Claude Code CLI in runtime
RUN npm install -g @anthropic-ai/claude-code

# Cargo and Rust toolchain needed at runtime for:
# - rebuild_and_restart (cargo build inside the container)
# - Agent-driven cargo commands (cargo clippy, cargo test, etc.)
COPY --from=base /usr/local/cargo /usr/local/cargo
COPY --from=base /usr/local/rustup /usr/local/rustup
ENV PATH="/usr/local/cargo/bin:${PATH}"
ENV RUSTUP_HOME="/usr/local/rustup"
ENV CARGO_HOME="/usr/local/cargo"

# cargo-nextest
COPY --from=base /usr/local/bin/cargo-nextest /usr/local/bin/cargo-nextest

# The storkit binary
COPY --from=base /usr/local/bin/storkit /usr/local/bin/storkit

# Copy the full source tree so rebuild_and_restart can do `cargo build`
# from the workspace root (CARGO_MANIFEST_DIR is baked into the binary).
# Alternative: mount the source as a volume.
COPY --from=base /app /app

WORKDIR /workspace

# ── Ports ────────────────────────────────────────────────────────────
# Web UI + MCP server
EXPOSE 3001

# ── Volumes (defined in docker-compose.yml) ──────────────────────────
# /workspace – bind mount: target project repo
# /root/.claude – named volume: Claude Code sessions/state
# /usr/local/cargo/registry – named volume: cargo dependency cache

# ── Entrypoint ───────────────────────────────────────────────────────
# Run storkit against the bind-mounted project at /workspace.
# The server picks up ANTHROPIC_API_KEY from the environment.
CMD ["storkit", "/workspace"]
93 docker/docker-compose.yml Normal file
@@ -0,0 +1,93 @@
# Story Kit – single-container deployment
#
# Usage:
#   # Set your API key and project path, then:
#   ANTHROPIC_API_KEY=sk-ant-... PROJECT_PATH=/path/to/your/repo \
#     docker compose -f docker/docker-compose.yml up
#
# OrbStack users: just install OrbStack and use `docker compose` normally.
# OrbStack's VirtioFS bind mount driver is significantly faster than
# Docker Desktop's default (see spike findings).

services:
  storkit:
    build:
      context: ..
      dockerfile: docker/Dockerfile
    container_name: storkit
    ports:
      # Web UI + MCP endpoint
      - "3001:3001"
    environment:
      # Required: Anthropic API key for Claude Code agents
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:?Set ANTHROPIC_API_KEY}
      # Optional: override the server port (default 3001)
      - STORKIT_PORT=3001
      # Optional: Matrix bot credentials (if using Matrix integration)
      - MATRIX_HOMESERVER=${MATRIX_HOMESERVER:-}
      - MATRIX_USER=${MATRIX_USER:-}
      - MATRIX_PASSWORD=${MATRIX_PASSWORD:-}
      # Optional: Slack bot/app tokens (if using Slack integration)
      - SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN:-}
      - SLACK_APP_TOKEN=${SLACK_APP_TOKEN:-}
    volumes:
      # The target project repo – bind-mounted from host.
      # Changes made by agents inside the container are immediately
      # visible on the host (and vice versa).
      - ${PROJECT_PATH:?Set PROJECT_PATH}:/workspace

      # Cargo registry cache – persists downloaded crates across
      # container restarts so `cargo build` doesn't re-download.
      - cargo-registry:/usr/local/cargo/registry

      # Cargo git checkouts – persists git-based dependencies.
      - cargo-git:/usr/local/cargo/git

      # Claude Code state – persists session history, projects config,
      # and conversation transcripts so --resume works across restarts.
      - claude-state:/root/.claude

      # Storkit source tree for rebuild_and_restart.
      # The binary has CARGO_MANIFEST_DIR baked in at compile time
      # pointing to /app/server, so the source must be at /app.
      # This is COPY'd in the Dockerfile; mounting over it allows
      # live source updates without rebuilding the image.
      # Mount host source so rebuild_and_restart picks up live changes:
      - ./..:/app

      # Keep cargo build artifacts off the bind mount.
      # Bind-mount directory traversal is ~23x slower than Docker volumes
      # (confirmed in spike 329). Cargo stat-checks every file in target/
      # on incremental builds — leaving it on the bind mount makes builds
      # catastrophically slow (~12s just to traverse the tree).
      - workspace-target:/workspace/target
      - storkit-target:/app/target

    # Resource limits – cap the whole system.
    # Adjust based on your machine. These are conservative defaults.
    deploy:
      resources:
        limits:
          cpus: "4"
          memory: 8G
        reservations:
          cpus: "1"
          memory: 2G

    # Health check – verify the MCP endpoint responds
    healthcheck:
      test: ["CMD", "curl", "-sf", "http://localhost:3001/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 10s

    # Restart policy – restart on crash but not on manual stop
    restart: unless-stopped

volumes:
  cargo-registry:
  cargo-git:
  claude-state:
  workspace-target:
  storkit-target:
@@ -115,6 +115,11 @@ export interface Message {
   tool_call_id?: string;
 }
 
+export interface AnthropicModelInfo {
+  id: string;
+  context_window: number;
+}
+
 export interface WorkItemContent {
   content: string;
   stage: string;
@@ -266,7 +271,7 @@ export const api = {
     return requestJson<boolean>("/anthropic/key/exists", {}, baseUrl);
   },
   getAnthropicModels(baseUrl?: string) {
-    return requestJson<string[]>("/anthropic/models", {}, baseUrl);
+    return requestJson<AnthropicModelInfo[]>("/anthropic/models", {}, baseUrl);
   },
   setAnthropicApiKey(api_key: string, baseUrl?: string) {
     return requestJson<boolean>(
@@ -4,7 +4,7 @@ import { Prism as SyntaxHighlighter } from "react-syntax-highlighter";
 import { oneDark } from "react-syntax-highlighter/dist/esm/styles/prism";
 import type { AgentConfigInfo } from "../api/agents";
 import { agentsApi } from "../api/agents";
-import type { PipelineState } from "../api/client";
+import type { AnthropicModelInfo, PipelineState } from "../api/client";
 import { api, ChatWebSocket } from "../api/client";
 import { useChatHistory } from "../hooks/useChatHistory";
 import type { Message, ProviderConfig } from "../types";
@@ -143,8 +143,13 @@ function formatToolActivity(toolName: string): string {
 
 const estimateTokens = (text: string): number => Math.ceil(text.length / 4);
 
-const getContextWindowSize = (modelName: string): number => {
-  if (modelName.startsWith("claude-")) return 200000;
+const getContextWindowSize = (
+  modelName: string,
+  claudeContextWindows?: Map<string, number>,
+): number => {
+  if (modelName.startsWith("claude-")) {
+    return claudeContextWindows?.get(modelName) ?? 200000;
+  }
   if (modelName.includes("llama3")) return 8192;
   if (modelName.includes("qwen2.5")) return 32768;
   if (modelName.includes("deepseek")) return 16384;
@@ -163,6 +168,9 @@ export function Chat({ projectPath, onCloseProject }: ChatProps) {
   const [enableTools, setEnableTools] = useState(true);
   const [availableModels, setAvailableModels] = useState<string[]>([]);
   const [claudeModels, setClaudeModels] = useState<string[]>([]);
+  const [claudeContextWindowMap, setClaudeContextWindowMap] = useState<
+    Map<string, number>
+  >(new Map());
   const [streamingContent, setStreamingContent] = useState("");
   const [streamingThinking, setStreamingThinking] = useState("");
   const [showApiKeyDialog, setShowApiKeyDialog] = useState(false);
@@ -285,7 +293,7 @@ export function Chat({ projectPath, onCloseProject }: ChatProps) {
       totalTokens += estimateTokens(streamingContent);
     }
 
-    const contextWindow = getContextWindowSize(model);
+    const contextWindow = getContextWindowSize(model, claudeContextWindowMap);
     const percentage = Math.round((totalTokens / contextWindow) * 100);
 
     return {
@@ -293,7 +301,7 @@ export function Chat({ projectPath, onCloseProject }: ChatProps) {
       total: contextWindow,
       percentage,
     };
-  }, [messages, streamingContent, model]);
+  }, [messages, streamingContent, model, claudeContextWindowMap]);
 
   useEffect(() => {
     try {
@@ -337,14 +345,18 @@ export function Chat({ projectPath, onCloseProject }: ChatProps) {
       .then((exists) => {
         setHasAnthropicKey(exists);
         if (!exists) return;
-        return api.getAnthropicModels().then((models) => {
+        return api.getAnthropicModels().then((models: AnthropicModelInfo[]) => {
           if (models.length > 0) {
             const sortedModels = models.sort((a, b) =>
-              a.toLowerCase().localeCompare(b.toLowerCase()),
+              a.id.toLowerCase().localeCompare(b.id.toLowerCase()),
+            );
+            setClaudeModels(sortedModels.map((m) => m.id));
+            setClaudeContextWindowMap(
+              new Map(sortedModels.map((m) => [m.id, m.context_window])),
             );
-            setClaudeModels(sortedModels);
           } else {
             setClaudeModels([]);
+            setClaudeContextWindowMap(new Map());
           }
         });
       })
@@ -49,7 +49,16 @@ PACKAGE_JSON="${SCRIPT_DIR}/frontend/package.json"
 sed -i '' "s/\"version\": \".*\"/\"version\": \"${VERSION}\"/" "$PACKAGE_JSON"
 echo "==> Bumped ${PACKAGE_JSON} to ${VERSION}"
 
-git add "$CARGO_TOML" "$PACKAGE_JSON"
+# Regenerate lock files so they stay in sync with the version bump.
+CARGO_LOCK="${SCRIPT_DIR}/Cargo.lock"
+(cd "${SCRIPT_DIR}/server" && cargo generate-lockfile)
+echo "==> Regenerated Cargo.lock"
+
+PACKAGE_LOCK="${SCRIPT_DIR}/frontend/package-lock.json"
+(cd "${SCRIPT_DIR}/frontend" && npm install --package-lock-only --ignore-scripts --silent 2>/dev/null)
+echo "==> Regenerated package-lock.json"
+
+git add "$CARGO_TOML" "$CARGO_LOCK" "$PACKAGE_JSON" "$PACKAGE_LOCK"
 git commit -m "Bump version to ${VERSION}"
 
 if ! command -v cross >/dev/null 2>&1; then
@@ -188,20 +197,29 @@ git push origin "$TAG"
 
 # ── Create Gitea Release ──────────────────────────────────────
 echo "==> Creating release on Gitea..."
-RELEASE_JSON=$(python3 -c "
+RELEASE_JSON_FILE=$(mktemp)
+trap "rm -f '$RELEASE_JSON_FILE'" EXIT
+python3 -c "
 import json, sys
-print(json.dumps({
+with open(sys.argv[3], 'w') as f:
+    json.dump({
     'tag_name': sys.argv[1],
     'name': sys.argv[1],
     'body': sys.argv[2]
-}))
-" "$TAG" "$RELEASE_BODY")
+    }, f)
+" "$TAG" "$RELEASE_BODY" "$RELEASE_JSON_FILE"
 
-RELEASE_RESPONSE=$(curl -sf -X POST \
+RELEASE_RESPONSE=$(curl -s --fail-with-body -X POST \
   -H "Authorization: token ${GITEA_TOKEN}" \
   -H "Content-Type: application/json" \
   "${GITEA_URL}/api/v1/repos/${REPO}/releases" \
-  -d "$RELEASE_JSON")
+  -d "@${RELEASE_JSON_FILE}")
+
+if [ $? -ne 0 ]; then
+  echo "Error: Failed to create Gitea release."
+  echo "Response: ${RELEASE_RESPONSE}"
+  exit 1
+fi
 
 RELEASE_ID=$(echo "$RELEASE_RESPONSE" | python3 -c "import sys,json; print(json.load(sys.stdin)['id'])")
 
@@ -2,7 +2,8 @@ pub mod gates;
 pub mod lifecycle;
 pub mod merge;
 mod pool;
-mod pty;
+pub(crate) mod pty;
+pub mod runtime;
 pub mod token_usage;
 
 use crate::config::AgentConfig;
@@ -17,6 +17,7 @@ use super::{
     AgentEvent, AgentInfo, AgentStatus, CompletionReport, PipelineStage, agent_config_stage,
     pipeline_stage,
 };
+use super::runtime::{AgentRuntime, ClaudeCodeRuntime, GeminiRuntime, OpenAiRuntime, RuntimeContext};
 
 /// Build the composite key used to track agents in the pool.
 fn composite_key(story_id: &str, agent_name: &str) -> String {
@@ -513,25 +514,71 @@ impl AgentPool {
             });
             Self::notify_agent_state_changed(&watcher_tx_clone);
 
-            // Step 4: launch the agent process.
-            match super::pty::run_agent_pty_streaming(
-                &sid,
-                &aname,
-                &command,
-                &args,
-                &prompt,
-                &wt_path_str,
-                &tx_clone,
-                &log_clone,
-                log_writer_clone,
-                inactivity_timeout_secs,
-                child_killers_clone,
-            )
-            .await
-            {
-                Ok(pty_result) => {
+            // Step 4: launch the agent process via the configured runtime.
+            let runtime_name = config_clone
+                .find_agent(&aname)
+                .and_then(|a| a.runtime.as_deref())
+                .unwrap_or("claude-code");
+
+            let run_result = match runtime_name {
+                "claude-code" => {
+                    let runtime = ClaudeCodeRuntime::new(child_killers_clone.clone());
+                    let ctx = RuntimeContext {
+                        story_id: sid.clone(),
+                        agent_name: aname.clone(),
+                        command,
+                        args,
+                        prompt,
+                        cwd: wt_path_str,
+                        inactivity_timeout_secs,
+                        mcp_port: port_for_task,
+                    };
+                    runtime
+                        .start(ctx, tx_clone.clone(), log_clone.clone(), log_writer_clone)
+                        .await
+                }
+                "gemini" => {
+                    let runtime = GeminiRuntime::new();
+                    let ctx = RuntimeContext {
+                        story_id: sid.clone(),
+                        agent_name: aname.clone(),
+                        command,
+                        args,
+                        prompt,
+                        cwd: wt_path_str,
+                        inactivity_timeout_secs,
+                        mcp_port: port_for_task,
+                    };
+                    runtime
+                        .start(ctx, tx_clone.clone(), log_clone.clone(), log_writer_clone)
+                        .await
+                }
+                "openai" => {
+                    let runtime = OpenAiRuntime::new();
+                    let ctx = RuntimeContext {
+                        story_id: sid.clone(),
+                        agent_name: aname.clone(),
+                        command,
+                        args,
+                        prompt,
+                        cwd: wt_path_str,
+                        inactivity_timeout_secs,
+                        mcp_port: port_for_task,
+                    };
+                    runtime
+                        .start(ctx, tx_clone.clone(), log_clone.clone(), log_writer_clone)
+                        .await
+                }
+                other => Err(format!(
+                    "Unknown agent runtime '{other}'; check the 'runtime' field in project.toml. \
+                     Supported: 'claude-code', 'gemini', 'openai'"
+                )),
+            };
+
+            match run_result {
+                Ok(result) => {
                     // Persist token usage if the agent reported it.
-                    if let Some(ref usage) = pty_result.token_usage
+                    if let Some(ref usage) = result.token_usage
                         && let Ok(agents) = agents_ref.lock()
                         && let Some(agent) = agents.get(&key_clone)
                         && let Some(ref pr) = agent.project_root
@@ -557,7 +604,7 @@ impl AgentPool {
                         port_for_task,
                         &sid,
                         &aname,
-                        pty_result.session_id,
+                        result.session_id,
                         watcher_tx_clone.clone(),
                     )
                     .await;
@@ -11,7 +11,7 @@ use crate::slog;
 use crate::slog_warn;
 
 /// Result from a PTY agent session, containing the session ID and token usage.
-pub(super) struct PtyResult {
+pub(in crate::agents) struct PtyResult {
     pub session_id: Option<String>,
     pub token_usage: Option<TokenUsage>,
 }

@@ -35,7 +35,7 @@ impl Drop for ChildKillerGuard {
 
 /// Spawn claude agent in a PTY and stream events through the broadcast channel.
 #[allow(clippy::too_many_arguments)]
-pub(super) async fn run_agent_pty_streaming(
+pub(in crate::agents) async fn run_agent_pty_streaming(
     story_id: &str,
     agent_name: &str,
     command: &str,
66 server/src/agents/runtime/claude_code.rs Normal file
@@ -0,0 +1,66 @@
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use portable_pty::ChildKiller;
use tokio::sync::broadcast;

use crate::agent_log::AgentLogWriter;

use super::{AgentEvent, AgentRuntime, RuntimeContext, RuntimeResult, RuntimeStatus};

/// Agent runtime that spawns the `claude` CLI in a PTY and streams JSON events.
///
/// This is the default runtime (`runtime = "claude-code"` in project.toml).
/// It wraps the existing PTY-based execution logic, preserving all streaming,
/// token tracking, and inactivity timeout behaviour.
pub struct ClaudeCodeRuntime {
    child_killers: Arc<Mutex<HashMap<String, Box<dyn ChildKiller + Send + Sync>>>>,
}

impl ClaudeCodeRuntime {
    pub fn new(
        child_killers: Arc<Mutex<HashMap<String, Box<dyn ChildKiller + Send + Sync>>>>,
    ) -> Self {
        Self { child_killers }
    }
}

impl AgentRuntime for ClaudeCodeRuntime {
    async fn start(
        &self,
        ctx: RuntimeContext,
        tx: broadcast::Sender<AgentEvent>,
        event_log: Arc<Mutex<Vec<AgentEvent>>>,
        log_writer: Option<Arc<Mutex<AgentLogWriter>>>,
    ) -> Result<RuntimeResult, String> {
        let pty_result = super::super::pty::run_agent_pty_streaming(
            &ctx.story_id,
            &ctx.agent_name,
            &ctx.command,
            &ctx.args,
            &ctx.prompt,
            &ctx.cwd,
            &tx,
            &event_log,
            log_writer,
            ctx.inactivity_timeout_secs,
            Arc::clone(&self.child_killers),
        )
        .await?;

        Ok(RuntimeResult {
            session_id: pty_result.session_id,
            token_usage: pty_result.token_usage,
        })
    }

    fn stop(&self) {
        // Stopping is handled externally by the pool via kill_child_for_key().
        // The ChildKillerGuard in pty.rs deregisters automatically on process exit.
    }

    fn get_status(&self) -> RuntimeStatus {
        // Lifecycle status is tracked by the pool; the runtime itself is stateless.
        RuntimeStatus::Idle
    }
}
809 server/src/agents/runtime/gemini.rs Normal file
@@ -0,0 +1,809 @@
|
|||||||
|
use std::sync::atomic::{AtomicBool, Ordering};
|
||||||
|
use std::sync::{Arc, Mutex};
|
||||||
|
|
||||||
|
use reqwest::Client;
|
||||||
|
use serde::{Deserialize, Serialize};
|
||||||
|
use serde_json::{json, Value};
|
||||||
|
use tokio::sync::broadcast;
|
||||||
|
|
||||||
|
use crate::agent_log::AgentLogWriter;
|
||||||
|
use crate::slog;
|
||||||
|
|
||||||
|
use super::super::{AgentEvent, TokenUsage};
|
||||||
|
use super::{AgentRuntime, RuntimeContext, RuntimeResult, RuntimeStatus};
|
||||||
|
|
||||||
|
// ── Public runtime struct ────────────────────────────────────────────
|
||||||
|
|
||||||
|
/// Agent runtime that drives a Gemini model through the Google AI
|
||||||
|
/// `generateContent` REST API.
|
||||||
|
///
|
||||||
|
/// The runtime:
|
||||||
|
/// 1. Fetches MCP tool definitions from storkit's MCP server.
|
||||||
|
/// 2. Converts them to Gemini function-calling format.
|
||||||
|
/// 3. Sends the agent prompt + tools to the Gemini API.
|
||||||
|
/// 4. Executes any requested function calls via MCP `tools/call`.
|
||||||
|
/// 5. Loops until the model produces a text-only response or an error.
|
||||||
|
/// 6. Tracks token usage from the API response metadata.
|
||||||
|
pub struct GeminiRuntime {
|
||||||
|
/// Whether a stop has been requested.
|
||||||
|
cancelled: Arc<AtomicBool>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl GeminiRuntime {
|
||||||
|
pub fn new() -> Self {
|
||||||
|
Self {
|
||||||
|
cancelled: Arc::new(AtomicBool::new(false)),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl AgentRuntime for GeminiRuntime {
    async fn start(
        &self,
        ctx: RuntimeContext,
        tx: broadcast::Sender<AgentEvent>,
        event_log: Arc<Mutex<Vec<AgentEvent>>>,
        log_writer: Option<Arc<Mutex<AgentLogWriter>>>,
    ) -> Result<RuntimeResult, String> {
        let api_key = std::env::var("GOOGLE_AI_API_KEY").map_err(|_| {
            "GOOGLE_AI_API_KEY environment variable is not set. \
             Set it to your Google AI API key to use the Gemini runtime."
                .to_string()
        })?;

        let model = if ctx.command.starts_with("gemini") {
            // The pool puts the model into `command` for non-CLI runtimes,
            // but also check args for a --model flag.
            ctx.command.clone()
        } else {
            // Fall back to args: look for --model <value>
            ctx.args
                .iter()
                .position(|a| a == "--model")
                .and_then(|i| ctx.args.get(i + 1))
                .cloned()
                .unwrap_or_else(|| "gemini-2.5-pro".to_string())
        };

        let mcp_port = ctx.mcp_port;
        let mcp_base = format!("http://localhost:{mcp_port}/mcp");

        let client = Client::new();
        let cancelled = Arc::clone(&self.cancelled);

        // Step 1: Fetch MCP tool definitions and convert to Gemini format.
        let gemini_tools = fetch_and_convert_mcp_tools(&client, &mcp_base).await?;

        // Step 2: Build the initial conversation contents.
        let system_instruction = build_system_instruction(&ctx);
        let mut contents: Vec<Value> = vec![json!({
            "role": "user",
            "parts": [{ "text": ctx.prompt }]
        })];

        let mut total_usage = TokenUsage {
            input_tokens: 0,
            output_tokens: 0,
            cache_creation_input_tokens: 0,
            cache_read_input_tokens: 0,
            total_cost_usd: 0.0,
        };

        let emit = |event: AgentEvent| {
            super::super::pty::emit_event(
                event,
                &tx,
                &event_log,
                log_writer.as_ref().map(|w| w.as_ref()),
            );
        };

        emit(AgentEvent::Status {
            story_id: ctx.story_id.clone(),
            agent_name: ctx.agent_name.clone(),
            status: "running".to_string(),
        });

        // Step 3: Conversation loop.
        let mut turn = 0u32;
        let max_turns = 200; // Safety limit
        loop {
            if cancelled.load(Ordering::Relaxed) {
                emit(AgentEvent::Error {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    message: "Agent was stopped by user".to_string(),
                });
                return Ok(RuntimeResult {
                    session_id: None,
                    token_usage: Some(total_usage),
                });
            }

            turn += 1;
            if turn > max_turns {
                emit(AgentEvent::Error {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    message: format!("Exceeded maximum turns ({max_turns})"),
                });
                return Ok(RuntimeResult {
                    session_id: None,
                    token_usage: Some(total_usage),
                });
            }

            slog!("[gemini] Turn {turn} for {}:{}", ctx.story_id, ctx.agent_name);

            let request_body = build_generate_content_request(
                &system_instruction,
                &contents,
                &gemini_tools,
            );

            let url = format!(
                "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={api_key}"
            );

            let response = client
                .post(&url)
                .json(&request_body)
                .send()
                .await
                .map_err(|e| format!("Gemini API request failed: {e}"))?;

            let status = response.status();
            let body: Value = response
                .json()
                .await
                .map_err(|e| format!("Failed to parse Gemini API response: {e}"))?;

            if !status.is_success() {
                let error_msg = body["error"]["message"]
                    .as_str()
                    .unwrap_or("Unknown API error");
                let err = format!("Gemini API error ({status}): {error_msg}");
                emit(AgentEvent::Error {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    message: err.clone(),
                });
                return Err(err);
            }

            // Accumulate token usage.
            if let Some(usage) = parse_usage_metadata(&body) {
                total_usage.input_tokens += usage.input_tokens;
                total_usage.output_tokens += usage.output_tokens;
            }

            // Extract the candidate response.
            let candidate = body["candidates"]
                .as_array()
                .and_then(|c| c.first())
                .ok_or_else(|| "No candidates in Gemini response".to_string())?;

            let parts = candidate["content"]["parts"]
                .as_array()
                .ok_or_else(|| "No parts in Gemini response candidate".to_string())?;

            // Check finish reason.
            let finish_reason = candidate["finishReason"].as_str().unwrap_or("");

            // Separate text parts and function call parts.
            let mut text_parts: Vec<String> = Vec::new();
            let mut function_calls: Vec<GeminiFunctionCall> = Vec::new();

            for part in parts {
                if let Some(text) = part["text"].as_str() {
                    text_parts.push(text.to_string());
                }
                if let Some(fc) = part.get("functionCall")
                    && let (Some(name), Some(args)) =
                        (fc["name"].as_str(), fc.get("args"))
                {
                    function_calls.push(GeminiFunctionCall {
                        name: name.to_string(),
                        args: args.clone(),
                    });
                }
            }

            // Emit any text output.
            for text in &text_parts {
                if !text.is_empty() {
                    emit(AgentEvent::Output {
                        story_id: ctx.story_id.clone(),
                        agent_name: ctx.agent_name.clone(),
                        text: text.clone(),
                    });
                }
            }

            // If no function calls, the model is done.
            if function_calls.is_empty() {
                emit(AgentEvent::Done {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    session_id: None,
                });
                return Ok(RuntimeResult {
                    session_id: None,
                    token_usage: Some(total_usage),
                });
            }

            // Add the model's response to the conversation.
            let model_parts: Vec<Value> = parts.to_vec();
            contents.push(json!({
                "role": "model",
                "parts": model_parts
            }));

            // Execute function calls via MCP and build response parts.
            let mut response_parts: Vec<Value> = Vec::new();

            for fc in &function_calls {
                if cancelled.load(Ordering::Relaxed) {
                    break;
                }

                slog!(
                    "[gemini] Calling MCP tool '{}' for {}:{}",
                    fc.name,
                    ctx.story_id,
                    ctx.agent_name
                );

                emit(AgentEvent::Output {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    text: format!("\n[Tool call: {}]\n", fc.name),
                });

                let tool_result =
                    call_mcp_tool(&client, &mcp_base, &fc.name, &fc.args).await;

                let response_value = match &tool_result {
                    Ok(result) => {
                        emit(AgentEvent::Output {
                            story_id: ctx.story_id.clone(),
                            agent_name: ctx.agent_name.clone(),
                            text: format!(
                                "[Tool result: {} chars]\n",
                                result.len()
                            ),
                        });
                        json!({ "result": result })
                    }
                    Err(e) => {
                        emit(AgentEvent::Output {
                            story_id: ctx.story_id.clone(),
                            agent_name: ctx.agent_name.clone(),
                            text: format!("[Tool error: {e}]\n"),
                        });
                        json!({ "error": e })
                    }
                };

                response_parts.push(json!({
                    "functionResponse": {
                        "name": fc.name,
                        "response": response_value
                    }
                }));
            }

            // Add function responses to the conversation.
            contents.push(json!({
                "role": "user",
                "parts": response_parts
            }));

            // Defensive guard: the no-function-call case already returned
            // above, so `function_calls` is never empty here and this break
            // cannot fire. Kept in case that early return is ever removed.
            if finish_reason == "STOP" && function_calls.is_empty() {
                break;
            }
        }

        emit(AgentEvent::Done {
            story_id: ctx.story_id.clone(),
            agent_name: ctx.agent_name.clone(),
            session_id: None,
        });

        Ok(RuntimeResult {
            session_id: None,
            token_usage: Some(total_usage),
        })
    }

    fn stop(&self) {
        self.cancelled.store(true, Ordering::Relaxed);
    }

    fn get_status(&self) -> RuntimeStatus {
        if self.cancelled.load(Ordering::Relaxed) {
            RuntimeStatus::Failed
        } else {
            RuntimeStatus::Idle
        }
    }
}
// ── Internal types ───────────────────────────────────────────────────

struct GeminiFunctionCall {
    name: String,
    args: Value,
}

// ── Gemini API types (for serde) ─────────────────────────────────────

#[derive(Debug, Serialize, Deserialize)]
struct GeminiFunctionDeclaration {
    name: String,
    description: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    parameters: Option<Value>,
}

// ── Helper functions ─────────────────────────────────────────────────
/// Build the system instruction content from the RuntimeContext.
fn build_system_instruction(ctx: &RuntimeContext) -> Value {
    // Use system_prompt from args if provided via --append-system-prompt,
    // otherwise use a sensible default.
    let system_text = ctx
        .args
        .iter()
        .position(|a| a == "--append-system-prompt")
        .and_then(|i| ctx.args.get(i + 1))
        .cloned()
        .unwrap_or_else(|| {
            format!(
                "You are an AI coding agent working on story {}. \
                 You have access to tools via function calling. \
                 Use them to complete the task. \
                 Work in the directory: {}",
                ctx.story_id, ctx.cwd
            )
        });

    json!({
        "parts": [{ "text": system_text }]
    })
}

/// Build the full `generateContent` request body.
fn build_generate_content_request(
    system_instruction: &Value,
    contents: &[Value],
    gemini_tools: &[GeminiFunctionDeclaration],
) -> Value {
    let mut body = json!({
        "system_instruction": system_instruction,
        "contents": contents,
        "generationConfig": {
            "temperature": 0.2,
            "maxOutputTokens": 65536,
        }
    });

    if !gemini_tools.is_empty() {
        body["tools"] = json!([{
            "functionDeclarations": gemini_tools
        }]);
    }

    body
}
/// Fetch MCP tool definitions from storkit's MCP server and convert
/// them to Gemini function declaration format.
async fn fetch_and_convert_mcp_tools(
    client: &Client,
    mcp_base: &str,
) -> Result<Vec<GeminiFunctionDeclaration>, String> {
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/list",
        "params": {}
    });

    let response = client
        .post(mcp_base)
        .json(&request)
        .send()
        .await
        .map_err(|e| format!("Failed to fetch MCP tools: {e}"))?;

    let body: Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse MCP tools response: {e}"))?;

    let tools = body["result"]["tools"]
        .as_array()
        .ok_or_else(|| "No tools array in MCP response".to_string())?;

    let mut declarations = Vec::new();

    for tool in tools {
        let name = tool["name"].as_str().unwrap_or("").to_string();
        let description = tool["description"].as_str().unwrap_or("").to_string();

        if name.is_empty() {
            continue;
        }

        // Convert MCP inputSchema (JSON Schema) to Gemini parameters
        // (OpenAPI-subset schema). They are structurally compatible for
        // simple object schemas.
        let parameters = convert_mcp_schema_to_gemini(tool.get("inputSchema"));

        declarations.push(GeminiFunctionDeclaration {
            name,
            description,
            parameters,
        });
    }

    slog!("[gemini] Loaded {} MCP tools as function declarations", declarations.len());
    Ok(declarations)
}
/// Convert an MCP inputSchema (JSON Schema) to a Gemini-compatible
/// OpenAPI-subset parameter schema.
///
/// Gemini function calling expects parameters in OpenAPI format, which
/// is structurally similar to JSON Schema for simple object types.
/// We strip unsupported fields and ensure the type is "object".
fn convert_mcp_schema_to_gemini(schema: Option<&Value>) -> Option<Value> {
    let schema = schema?;

    // If the schema has no properties (empty tool), return None.
    let properties = schema.get("properties")?;
    if properties.as_object().is_some_and(|p| p.is_empty()) {
        return None;
    }

    let mut result = json!({
        "type": "object",
        "properties": clean_schema_properties(properties),
    });

    // Preserve required fields if present.
    if let Some(required) = schema.get("required") {
        result["required"] = required.clone();
    }

    Some(result)
}

/// Recursively clean schema properties to be Gemini-compatible.
/// Removes unsupported JSON Schema keywords.
fn clean_schema_properties(properties: &Value) -> Value {
    let Some(obj) = properties.as_object() else {
        return properties.clone();
    };

    let mut cleaned = serde_json::Map::new();
    for (key, value) in obj {
        let mut prop = value.clone();
        // Remove JSON Schema keywords not supported by Gemini.
        if let Some(p) = prop.as_object_mut() {
            p.remove("$schema");
            p.remove("additionalProperties");

            // Recursively clean nested object properties.
            if let Some(nested_props) = p.get("properties").cloned() {
                p.insert(
                    "properties".to_string(),
                    clean_schema_properties(&nested_props),
                );
            }

            // Clean items schema for arrays.
            if let Some(items) = p.get("items").cloned()
                && let Some(items_obj) = items.as_object()
            {
                let mut cleaned_items = items_obj.clone();
                cleaned_items.remove("$schema");
                cleaned_items.remove("additionalProperties");
                p.insert("items".to_string(), Value::Object(cleaned_items));
            }
        }
        cleaned.insert(key.clone(), prop);
    }
    Value::Object(cleaned)
}
/// Call an MCP tool via storkit's MCP server.
async fn call_mcp_tool(
    client: &Client,
    mcp_base: &str,
    tool_name: &str,
    args: &Value,
) -> Result<String, String> {
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": tool_name,
            "arguments": args
        }
    });

    let response = client
        .post(mcp_base)
        .json(&request)
        .send()
        .await
        .map_err(|e| format!("MCP tool call failed: {e}"))?;

    let body: Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse MCP tool response: {e}"))?;

    if let Some(error) = body.get("error") {
        let msg = error["message"].as_str().unwrap_or("Unknown MCP error");
        return Err(format!("MCP tool '{tool_name}' error: {msg}"));
    }

    // MCP tools/call returns { result: { content: [{ type: "text", text: "..." }] } }
    let content = &body["result"]["content"];
    if let Some(arr) = content.as_array() {
        let texts: Vec<&str> = arr
            .iter()
            .filter_map(|c| c["text"].as_str())
            .collect();
        if !texts.is_empty() {
            return Ok(texts.join("\n"));
        }
    }

    // Fall back to serializing the entire result.
    Ok(body["result"].to_string())
}
/// Parse token usage metadata from a Gemini API response.
fn parse_usage_metadata(response: &Value) -> Option<TokenUsage> {
    let metadata = response.get("usageMetadata")?;
    Some(TokenUsage {
        input_tokens: metadata
            .get("promptTokenCount")
            .and_then(|v| v.as_u64())
            .unwrap_or(0),
        output_tokens: metadata
            .get("candidatesTokenCount")
            .and_then(|v| v.as_u64())
            .unwrap_or(0),
        // Gemini doesn't have cache token fields, but we keep the struct uniform.
        cache_creation_input_tokens: 0,
        cache_read_input_tokens: 0,
        // Google AI API doesn't report cost; leave at 0.
        total_cost_usd: 0.0,
    })
}
// ── Tests ────────────────────────────────────────────────────────────

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn convert_mcp_schema_simple_object() {
        let schema = json!({
            "type": "object",
            "properties": {
                "story_id": {
                    "type": "string",
                    "description": "Story identifier"
                }
            },
            "required": ["story_id"]
        });

        let result = convert_mcp_schema_to_gemini(Some(&schema)).unwrap();
        assert_eq!(result["type"], "object");
        assert!(result["properties"]["story_id"].is_object());
        assert_eq!(result["required"][0], "story_id");
    }

    #[test]
    fn convert_mcp_schema_empty_properties_returns_none() {
        let schema = json!({
            "type": "object",
            "properties": {}
        });

        assert!(convert_mcp_schema_to_gemini(Some(&schema)).is_none());
    }

    #[test]
    fn convert_mcp_schema_none_returns_none() {
        assert!(convert_mcp_schema_to_gemini(None).is_none());
    }

    #[test]
    fn convert_mcp_schema_strips_additional_properties() {
        let schema = json!({
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "additionalProperties": false,
                    "$schema": "http://json-schema.org/draft-07/schema#"
                }
            }
        });

        let result = convert_mcp_schema_to_gemini(Some(&schema)).unwrap();
        let name_prop = &result["properties"]["name"];
        assert!(name_prop.get("additionalProperties").is_none());
        assert!(name_prop.get("$schema").is_none());
        assert_eq!(name_prop["type"], "string");
    }

    #[test]
    fn convert_mcp_schema_with_nested_objects() {
        let schema = json!({
            "type": "object",
            "properties": {
                "config": {
                    "type": "object",
                    "properties": {
                        "key": { "type": "string" }
                    }
                }
            }
        });

        let result = convert_mcp_schema_to_gemini(Some(&schema)).unwrap();
        assert!(result["properties"]["config"]["properties"]["key"].is_object());
    }

    #[test]
    fn convert_mcp_schema_with_array_items() {
        let schema = json!({
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": { "type": "string" }
                        },
                        "additionalProperties": false
                    }
                }
            }
        });

        let result = convert_mcp_schema_to_gemini(Some(&schema)).unwrap();
        let items_schema = &result["properties"]["items"]["items"];
        assert!(items_schema.get("additionalProperties").is_none());
    }

    #[test]
    fn build_system_instruction_uses_args() {
        let ctx = RuntimeContext {
            story_id: "42_story_test".to_string(),
            agent_name: "coder-1".to_string(),
            command: "gemini-2.5-pro".to_string(),
            args: vec![
                "--append-system-prompt".to_string(),
                "Custom system prompt".to_string(),
            ],
            prompt: "Do the thing".to_string(),
            cwd: "/tmp/wt".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };

        let instruction = build_system_instruction(&ctx);
        assert_eq!(instruction["parts"][0]["text"], "Custom system prompt");
    }

    #[test]
    fn build_system_instruction_default() {
        let ctx = RuntimeContext {
            story_id: "42_story_test".to_string(),
            agent_name: "coder-1".to_string(),
            command: "gemini-2.5-pro".to_string(),
            args: vec![],
            prompt: "Do the thing".to_string(),
            cwd: "/tmp/wt".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };

        let instruction = build_system_instruction(&ctx);
        let text = instruction["parts"][0]["text"].as_str().unwrap();
        assert!(text.contains("42_story_test"));
        assert!(text.contains("/tmp/wt"));
    }

    #[test]
    fn build_generate_content_request_includes_tools() {
        let system = json!({"parts": [{"text": "system"}]});
        let contents = vec![json!({"role": "user", "parts": [{"text": "hello"}]})];
        let tools = vec![GeminiFunctionDeclaration {
            name: "my_tool".to_string(),
            description: "A tool".to_string(),
            parameters: Some(json!({"type": "object", "properties": {"x": {"type": "string"}}})),
        }];

        let body = build_generate_content_request(&system, &contents, &tools);
        assert!(body["tools"][0]["functionDeclarations"].is_array());
        assert_eq!(body["tools"][0]["functionDeclarations"][0]["name"], "my_tool");
    }

    #[test]
    fn build_generate_content_request_no_tools() {
        let system = json!({"parts": [{"text": "system"}]});
        let contents = vec![json!({"role": "user", "parts": [{"text": "hello"}]})];
        let tools: Vec<GeminiFunctionDeclaration> = vec![];

        let body = build_generate_content_request(&system, &contents, &tools);
        assert!(body.get("tools").is_none());
    }

    #[test]
    fn parse_usage_metadata_valid() {
        let response = json!({
            "usageMetadata": {
                "promptTokenCount": 100,
                "candidatesTokenCount": 50,
                "totalTokenCount": 150
            }
        });

        let usage = parse_usage_metadata(&response).unwrap();
        assert_eq!(usage.input_tokens, 100);
        assert_eq!(usage.output_tokens, 50);
        assert_eq!(usage.cache_creation_input_tokens, 0);
        assert_eq!(usage.total_cost_usd, 0.0);
    }

    #[test]
    fn parse_usage_metadata_missing() {
        let response = json!({"candidates": []});
        assert!(parse_usage_metadata(&response).is_none());
    }

    #[test]
    fn gemini_runtime_stop_sets_cancelled() {
        let runtime = GeminiRuntime::new();
        assert_eq!(runtime.get_status(), RuntimeStatus::Idle);
        runtime.stop();
        assert_eq!(runtime.get_status(), RuntimeStatus::Failed);
    }

    #[test]
    fn model_extraction_from_command() {
        // When command starts with "gemini", use it as the model name.
        let ctx = RuntimeContext {
            story_id: "1".to_string(),
            agent_name: "coder".to_string(),
            command: "gemini-2.5-pro".to_string(),
            args: vec![],
            prompt: "test".to_string(),
            cwd: "/tmp".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };

        // The model extraction logic is inside start(), but we test the
        // condition here.
        assert!(ctx.command.starts_with("gemini"));
    }
}
server/src/agents/runtime/mod.rs  (new file, 159 lines)
@@ -0,0 +1,159 @@
mod claude_code;
mod gemini;
mod openai;

pub use claude_code::ClaudeCodeRuntime;
pub use gemini::GeminiRuntime;
pub use openai::OpenAiRuntime;

use std::sync::{Arc, Mutex};
use tokio::sync::broadcast;

use crate::agent_log::AgentLogWriter;

use super::{AgentEvent, TokenUsage};

/// Context passed to a runtime when launching an agent session.
pub struct RuntimeContext {
    pub story_id: String,
    pub agent_name: String,
    pub command: String,
    pub args: Vec<String>,
    pub prompt: String,
    pub cwd: String,
    pub inactivity_timeout_secs: u64,
    /// Port of the storkit MCP server, used by API-based runtimes (Gemini, OpenAI)
    /// to call back for tool execution.
    pub mcp_port: u16,
}

/// Result returned by a runtime after the agent session completes.
pub struct RuntimeResult {
    pub session_id: Option<String>,
    pub token_usage: Option<TokenUsage>,
}

/// Runtime status reported by the backend.
#[derive(Debug, Clone, PartialEq)]
#[allow(dead_code)]
pub enum RuntimeStatus {
    Idle,
    Running,
    Completed,
    Failed,
}
/// Abstraction over different agent execution backends.
///
/// Implementations:
/// - [`ClaudeCodeRuntime`]: spawns the `claude` CLI via a PTY (default, `runtime = "claude-code"`)
/// - [`GeminiRuntime`]: drives a Gemini model through the Google AI `generateContent` REST API
/// - [`OpenAiRuntime`]: drives an OpenAI model through the Chat Completions API
#[allow(dead_code)]
pub trait AgentRuntime: Send + Sync {
    /// Start the agent and drive it to completion, streaming events through
    /// the provided broadcast sender and event log.
    ///
    /// Returns when the agent session finishes (success or error).
    async fn start(
        &self,
        ctx: RuntimeContext,
        tx: broadcast::Sender<AgentEvent>,
        event_log: Arc<Mutex<Vec<AgentEvent>>>,
        log_writer: Option<Arc<Mutex<AgentLogWriter>>>,
    ) -> Result<RuntimeResult, String>;

    /// Stop the running agent.
    fn stop(&self);

    /// Get the current runtime status.
    fn get_status(&self) -> RuntimeStatus;

    /// Return any events buffered outside the broadcast channel.
    ///
    /// PTY-based runtimes stream directly to the broadcast channel; this
    /// returns empty by default. API-based runtimes may buffer events here.
    fn stream_events(&self) -> Vec<AgentEvent> {
        vec![]
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn runtime_context_fields() {
        let ctx = RuntimeContext {
            story_id: "42_story_foo".to_string(),
            agent_name: "coder-1".to_string(),
            command: "claude".to_string(),
            args: vec!["--model".to_string(), "sonnet".to_string()],
            prompt: "Do the thing".to_string(),
            cwd: "/tmp/wt".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };
        assert_eq!(ctx.story_id, "42_story_foo");
        assert_eq!(ctx.agent_name, "coder-1");
        assert_eq!(ctx.command, "claude");
        assert_eq!(ctx.args.len(), 2);
        assert_eq!(ctx.prompt, "Do the thing");
        assert_eq!(ctx.cwd, "/tmp/wt");
        assert_eq!(ctx.inactivity_timeout_secs, 300);
        assert_eq!(ctx.mcp_port, 3001);
    }

    #[test]
    fn runtime_result_fields() {
        let result = RuntimeResult {
            session_id: Some("sess-123".to_string()),
            token_usage: Some(TokenUsage {
                input_tokens: 100,
                output_tokens: 50,
                cache_creation_input_tokens: 0,
                cache_read_input_tokens: 0,
                total_cost_usd: 0.01,
            }),
        };
        assert_eq!(result.session_id, Some("sess-123".to_string()));
        assert!(result.token_usage.is_some());
        let usage = result.token_usage.unwrap();
        assert_eq!(usage.input_tokens, 100);
        assert_eq!(usage.output_tokens, 50);
        assert_eq!(usage.total_cost_usd, 0.01);
    }

    #[test]
    fn runtime_result_no_usage() {
        let result = RuntimeResult {
            session_id: None,
            token_usage: None,
        };
        assert!(result.session_id.is_none());
        assert!(result.token_usage.is_none());
    }

    #[test]
    fn runtime_status_variants() {
        assert_eq!(RuntimeStatus::Idle, RuntimeStatus::Idle);
        assert_ne!(RuntimeStatus::Running, RuntimeStatus::Completed);
        assert_ne!(RuntimeStatus::Failed, RuntimeStatus::Idle);
    }

    #[test]
    fn claude_code_runtime_get_status_returns_idle() {
        use std::collections::HashMap;
        let killers = Arc::new(Mutex::new(HashMap::new()));
        let runtime = ClaudeCodeRuntime::new(killers);
        assert_eq!(runtime.get_status(), RuntimeStatus::Idle);
    }

    #[test]
    fn claude_code_runtime_stream_events_empty() {
        use std::collections::HashMap;
        let killers = Arc::new(Mutex::new(HashMap::new()));
        let runtime = ClaudeCodeRuntime::new(killers);
        assert!(runtime.stream_events().is_empty());
    }
}
server/src/agents/runtime/openai.rs  (new file, 704 lines)
@@ -0,0 +1,704 @@
use std::sync::atomic::{AtomicBool, Ordering};
|
||||||
|
use std::sync::{Arc, Mutex};
|
||||||
|
|
||||||
|
use reqwest::Client;
|
||||||
|
use serde_json::{json, Value};
|
||||||
|
use tokio::sync::broadcast;
|
||||||
|
|
||||||
|
use crate::agent_log::AgentLogWriter;
|
||||||
|
use crate::slog;
|
||||||
|
|
||||||
|
use super::super::{AgentEvent, TokenUsage};
|
||||||
|
use super::{AgentRuntime, RuntimeContext, RuntimeResult, RuntimeStatus};
|
||||||
|
|
||||||
|
// ── Public runtime struct ────────────────────────────────────────────
|
||||||
|
|
||||||
|
/// Agent runtime that drives an OpenAI model (GPT-4o, o3, etc.) through
|
||||||
|
/// the OpenAI Chat Completions API.
|
||||||
|
///
|
||||||
|
/// The runtime:
|
||||||
|
/// 1. Fetches MCP tool definitions from storkit's MCP server.
|
||||||
|
/// 2. Converts them to OpenAI function-calling format.
|
||||||
|
/// 3. Sends the agent prompt + tools to the Chat Completions API.
|
||||||
|
/// 4. Executes any requested tool calls via MCP `tools/call`.
|
||||||
|
/// 5. Loops until the model produces a response with no tool calls.
|
||||||
|
/// 6. Tracks token usage from the API response.
|
||||||
|
pub struct OpenAiRuntime {
|
||||||
|
/// Whether a stop has been requested.
|
||||||
|
cancelled: Arc<AtomicBool>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl OpenAiRuntime {
|
||||||
|
pub fn new() -> Self {
|
||||||
|
Self {
|
||||||
|
cancelled: Arc::new(AtomicBool::new(false)),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl AgentRuntime for OpenAiRuntime {
    async fn start(
        &self,
        ctx: RuntimeContext,
        tx: broadcast::Sender<AgentEvent>,
        event_log: Arc<Mutex<Vec<AgentEvent>>>,
        log_writer: Option<Arc<Mutex<AgentLogWriter>>>,
    ) -> Result<RuntimeResult, String> {
        let api_key = std::env::var("OPENAI_API_KEY").map_err(|_| {
            "OPENAI_API_KEY environment variable is not set. \
             Set it to your OpenAI API key to use the OpenAI runtime."
                .to_string()
        })?;

        let model = if ctx.command.starts_with("gpt") || ctx.command.starts_with("o") {
            // The pool puts the model into `command` for non-CLI runtimes.
            ctx.command.clone()
        } else {
            // Fall back to args: look for --model <value>
            ctx.args
                .iter()
                .position(|a| a == "--model")
                .and_then(|i| ctx.args.get(i + 1))
                .cloned()
                .unwrap_or_else(|| "gpt-4o".to_string())
        };

        let mcp_port = ctx.mcp_port;
        let mcp_base = format!("http://localhost:{mcp_port}/mcp");

        let client = Client::new();
        let cancelled = Arc::clone(&self.cancelled);

        // Step 1: Fetch MCP tool definitions and convert to OpenAI format.
        let openai_tools = fetch_and_convert_mcp_tools(&client, &mcp_base).await?;

        // Step 2: Build the initial conversation messages.
        let system_text = build_system_text(&ctx);
        let mut messages: Vec<Value> = vec![
            json!({ "role": "system", "content": system_text }),
            json!({ "role": "user", "content": ctx.prompt }),
        ];

        let mut total_usage = TokenUsage {
            input_tokens: 0,
            output_tokens: 0,
            cache_creation_input_tokens: 0,
            cache_read_input_tokens: 0,
            total_cost_usd: 0.0,
        };

        let emit = |event: AgentEvent| {
            super::super::pty::emit_event(
                event,
                &tx,
                &event_log,
                log_writer.as_ref().map(|w| w.as_ref()),
            );
        };

        emit(AgentEvent::Status {
            story_id: ctx.story_id.clone(),
            agent_name: ctx.agent_name.clone(),
            status: "running".to_string(),
        });

        // Step 3: Conversation loop.
        let mut turn = 0u32;
        let max_turns = 200; // Safety limit

        loop {
            if cancelled.load(Ordering::Relaxed) {
                emit(AgentEvent::Error {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    message: "Agent was stopped by user".to_string(),
                });
                return Ok(RuntimeResult {
                    session_id: None,
                    token_usage: Some(total_usage),
                });
            }

            turn += 1;
            if turn > max_turns {
                emit(AgentEvent::Error {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    message: format!("Exceeded maximum turns ({max_turns})"),
                });
                return Ok(RuntimeResult {
                    session_id: None,
                    token_usage: Some(total_usage),
                });
            }

            slog!(
                "[openai] Turn {turn} for {}:{}",
                ctx.story_id,
                ctx.agent_name
            );

            let mut request_body = json!({
                "model": model,
                "messages": messages,
                "temperature": 0.2,
            });

            if !openai_tools.is_empty() {
                request_body["tools"] = json!(openai_tools);
            }

            let response = client
                .post("https://api.openai.com/v1/chat/completions")
                .bearer_auth(&api_key)
                .json(&request_body)
                .send()
                .await
                .map_err(|e| format!("OpenAI API request failed: {e}"))?;

            let status = response.status();
            let body: Value = response
                .json()
                .await
                .map_err(|e| format!("Failed to parse OpenAI API response: {e}"))?;

            if !status.is_success() {
                let error_msg = body["error"]["message"]
                    .as_str()
                    .unwrap_or("Unknown API error");
                let err = format!("OpenAI API error ({status}): {error_msg}");
                emit(AgentEvent::Error {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    message: err.clone(),
                });
                return Err(err);
            }

            // Accumulate token usage.
            if let Some(usage) = parse_usage(&body) {
                total_usage.input_tokens += usage.input_tokens;
                total_usage.output_tokens += usage.output_tokens;
            }

            // Extract the first choice.
            let choice = body["choices"]
                .as_array()
                .and_then(|c| c.first())
                .ok_or_else(|| "No choices in OpenAI response".to_string())?;

            let message = &choice["message"];
            let content = message["content"].as_str().unwrap_or("");

            // Emit any text content.
            if !content.is_empty() {
                emit(AgentEvent::Output {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    text: content.to_string(),
                });
            }

            // Check for tool calls.
            let tool_calls = message["tool_calls"].as_array();

            if tool_calls.is_none() || tool_calls.is_some_and(|tc| tc.is_empty()) {
                // No tool calls — model is done.
                emit(AgentEvent::Done {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    session_id: None,
                });
                return Ok(RuntimeResult {
                    session_id: None,
                    token_usage: Some(total_usage),
                });
            }

            let tool_calls = tool_calls.unwrap();

            // Add the assistant message (with tool_calls) to the conversation.
            messages.push(message.clone());

            // Execute each tool call via MCP and add results.
            for tc in tool_calls {
                if cancelled.load(Ordering::Relaxed) {
                    break;
                }

                let call_id = tc["id"].as_str().unwrap_or("");
                let function = &tc["function"];
                let tool_name = function["name"].as_str().unwrap_or("");
                let arguments_str = function["arguments"].as_str().unwrap_or("{}");

                let args: Value = serde_json::from_str(arguments_str).unwrap_or(json!({}));

                slog!(
                    "[openai] Calling MCP tool '{}' for {}:{}",
                    tool_name,
                    ctx.story_id,
                    ctx.agent_name
                );

                emit(AgentEvent::Output {
                    story_id: ctx.story_id.clone(),
                    agent_name: ctx.agent_name.clone(),
                    text: format!("\n[Tool call: {tool_name}]\n"),
                });

                let tool_result = call_mcp_tool(&client, &mcp_base, tool_name, &args).await;

                let result_content = match &tool_result {
                    Ok(result) => {
                        emit(AgentEvent::Output {
                            story_id: ctx.story_id.clone(),
                            agent_name: ctx.agent_name.clone(),
                            text: format!("[Tool result: {} chars]\n", result.len()),
                        });
                        result.clone()
                    }
                    Err(e) => {
                        emit(AgentEvent::Output {
                            story_id: ctx.story_id.clone(),
                            agent_name: ctx.agent_name.clone(),
                            text: format!("[Tool error: {e}]\n"),
                        });
                        format!("Error: {e}")
                    }
                };

                // OpenAI expects tool results as role=tool messages with
                // the matching tool_call_id.
                messages.push(json!({
                    "role": "tool",
                    "tool_call_id": call_id,
                    "content": result_content,
                }));
            }
        }
    }

    fn stop(&self) {
        self.cancelled.store(true, Ordering::Relaxed);
    }

    fn get_status(&self) -> RuntimeStatus {
        if self.cancelled.load(Ordering::Relaxed) {
            RuntimeStatus::Failed
        } else {
            RuntimeStatus::Idle
        }
    }
}
// ── Helper functions ─────────────────────────────────────────────────

/// Build the system message text from the RuntimeContext.
fn build_system_text(ctx: &RuntimeContext) -> String {
    ctx.args
        .iter()
        .position(|a| a == "--append-system-prompt")
        .and_then(|i| ctx.args.get(i + 1))
        .cloned()
        .unwrap_or_else(|| {
            format!(
                "You are an AI coding agent working on story {}. \
                 You have access to tools via function calling. \
                 Use them to complete the task. \
                 Work in the directory: {}",
                ctx.story_id, ctx.cwd
            )
        })
}

/// Fetch MCP tool definitions from storkit's MCP server and convert
/// them to OpenAI function-calling format.
async fn fetch_and_convert_mcp_tools(
    client: &Client,
    mcp_base: &str,
) -> Result<Vec<Value>, String> {
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/list",
        "params": {}
    });

    let response = client
        .post(mcp_base)
        .json(&request)
        .send()
        .await
        .map_err(|e| format!("Failed to fetch MCP tools: {e}"))?;

    let body: Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse MCP tools response: {e}"))?;

    let tools = body["result"]["tools"]
        .as_array()
        .ok_or_else(|| "No tools array in MCP response".to_string())?;

    let mut openai_tools = Vec::new();

    for tool in tools {
        let name = tool["name"].as_str().unwrap_or("").to_string();
        let description = tool["description"].as_str().unwrap_or("").to_string();

        if name.is_empty() {
            continue;
        }

        // OpenAI function calling uses JSON Schema natively for parameters,
        // so the MCP inputSchema can be used with minimal cleanup.
        let parameters = convert_mcp_schema_to_openai(tool.get("inputSchema"));

        openai_tools.push(json!({
            "type": "function",
            "function": {
                "name": name,
                "description": description,
                "parameters": parameters.unwrap_or_else(|| json!({"type": "object", "properties": {}})),
            }
        }));
    }

    slog!(
        "[openai] Loaded {} MCP tools as function definitions",
        openai_tools.len()
    );
    Ok(openai_tools)
}

/// Convert an MCP inputSchema (JSON Schema) to OpenAI-compatible
/// function parameters.
///
/// OpenAI uses JSON Schema natively, so less transformation is needed
/// compared to Gemini. We still strip `$schema` to keep payloads clean.
fn convert_mcp_schema_to_openai(schema: Option<&Value>) -> Option<Value> {
    let schema = schema?;

    let mut result = json!({
        "type": "object",
    });

    if let Some(properties) = schema.get("properties") {
        result["properties"] = clean_schema_properties(properties);
    } else {
        result["properties"] = json!({});
    }

    if let Some(required) = schema.get("required") {
        result["required"] = required.clone();
    }

    // OpenAI recommends additionalProperties: false for strict mode.
    result["additionalProperties"] = json!(false);

    Some(result)
}

/// Recursively clean schema properties, removing unsupported keywords.
fn clean_schema_properties(properties: &Value) -> Value {
    let Some(obj) = properties.as_object() else {
        return properties.clone();
    };

    let mut cleaned = serde_json::Map::new();
    for (key, value) in obj {
        let mut prop = value.clone();
        if let Some(p) = prop.as_object_mut() {
            p.remove("$schema");

            // Recursively clean nested object properties.
            if let Some(nested_props) = p.get("properties").cloned() {
                p.insert(
                    "properties".to_string(),
                    clean_schema_properties(&nested_props),
                );
            }

            // Clean items schema for arrays.
            if let Some(items) = p.get("items").cloned()
                && let Some(items_obj) = items.as_object()
            {
                let mut cleaned_items = items_obj.clone();
                cleaned_items.remove("$schema");
                p.insert("items".to_string(), Value::Object(cleaned_items));
            }
        }
        cleaned.insert(key.clone(), prop);
    }
    Value::Object(cleaned)
}
/// Call an MCP tool via storkit's MCP server.
async fn call_mcp_tool(
    client: &Client,
    mcp_base: &str,
    tool_name: &str,
    args: &Value,
) -> Result<String, String> {
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": tool_name,
            "arguments": args
        }
    });

    let response = client
        .post(mcp_base)
        .json(&request)
        .send()
        .await
        .map_err(|e| format!("MCP tool call failed: {e}"))?;

    let body: Value = response
        .json()
        .await
        .map_err(|e| format!("Failed to parse MCP tool response: {e}"))?;

    if let Some(error) = body.get("error") {
        let msg = error["message"].as_str().unwrap_or("Unknown MCP error");
        return Err(format!("MCP tool '{tool_name}' error: {msg}"));
    }

    // MCP tools/call returns { result: { content: [{ type: "text", text: "..." }] } }
    let content = &body["result"]["content"];
    if let Some(arr) = content.as_array() {
        let texts: Vec<&str> = arr
            .iter()
            .filter_map(|c| c["text"].as_str())
            .collect();
        if !texts.is_empty() {
            return Ok(texts.join("\n"));
        }
    }

    // Fall back to serializing the entire result.
    Ok(body["result"].to_string())
}

/// Parse token usage from an OpenAI API response.
fn parse_usage(response: &Value) -> Option<TokenUsage> {
    let usage = response.get("usage")?;
    Some(TokenUsage {
        input_tokens: usage
            .get("prompt_tokens")
            .and_then(|v| v.as_u64())
            .unwrap_or(0),
        output_tokens: usage
            .get("completion_tokens")
            .and_then(|v| v.as_u64())
            .unwrap_or(0),
        cache_creation_input_tokens: 0,
        cache_read_input_tokens: 0,
        // OpenAI API doesn't report cost directly; leave at 0.
        total_cost_usd: 0.0,
    })
}
// ── Tests ────────────────────────────────────────────────────────────

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn convert_mcp_schema_simple_object() {
        let schema = json!({
            "type": "object",
            "properties": {
                "story_id": {
                    "type": "string",
                    "description": "Story identifier"
                }
            },
            "required": ["story_id"]
        });

        let result = convert_mcp_schema_to_openai(Some(&schema)).unwrap();
        assert_eq!(result["type"], "object");
        assert!(result["properties"]["story_id"].is_object());
        assert_eq!(result["required"][0], "story_id");
        assert_eq!(result["additionalProperties"], false);
    }

    #[test]
    fn convert_mcp_schema_empty_properties() {
        let schema = json!({
            "type": "object",
            "properties": {}
        });

        let result = convert_mcp_schema_to_openai(Some(&schema)).unwrap();
        assert_eq!(result["type"], "object");
        assert!(result["properties"].as_object().unwrap().is_empty());
    }

    #[test]
    fn convert_mcp_schema_none_returns_none() {
        assert!(convert_mcp_schema_to_openai(None).is_none());
    }

    #[test]
    fn convert_mcp_schema_strips_dollar_schema() {
        let schema = json!({
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "$schema": "http://json-schema.org/draft-07/schema#"
                }
            }
        });

        let result = convert_mcp_schema_to_openai(Some(&schema)).unwrap();
        let name_prop = &result["properties"]["name"];
        assert!(name_prop.get("$schema").is_none());
        assert_eq!(name_prop["type"], "string");
    }

    #[test]
    fn convert_mcp_schema_with_nested_objects() {
        let schema = json!({
            "type": "object",
            "properties": {
                "config": {
                    "type": "object",
                    "properties": {
                        "key": { "type": "string" }
                    }
                }
            }
        });

        let result = convert_mcp_schema_to_openai(Some(&schema)).unwrap();
        assert!(result["properties"]["config"]["properties"]["key"].is_object());
    }

    #[test]
    fn convert_mcp_schema_with_array_items() {
        let schema = json!({
            "type": "object",
            "properties": {
                "items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": { "type": "string" }
                        },
                        "$schema": "http://json-schema.org/draft-07/schema#"
                    }
                }
            }
        });

        let result = convert_mcp_schema_to_openai(Some(&schema)).unwrap();
        let items_schema = &result["properties"]["items"]["items"];
        assert!(items_schema.get("$schema").is_none());
    }

    #[test]
    fn build_system_text_uses_args() {
        let ctx = RuntimeContext {
            story_id: "42_story_test".to_string(),
            agent_name: "coder-1".to_string(),
            command: "gpt-4o".to_string(),
            args: vec![
                "--append-system-prompt".to_string(),
                "Custom system prompt".to_string(),
            ],
            prompt: "Do the thing".to_string(),
            cwd: "/tmp/wt".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };

        assert_eq!(build_system_text(&ctx), "Custom system prompt");
    }

    #[test]
    fn build_system_text_default() {
        let ctx = RuntimeContext {
            story_id: "42_story_test".to_string(),
            agent_name: "coder-1".to_string(),
            command: "gpt-4o".to_string(),
            args: vec![],
            prompt: "Do the thing".to_string(),
            cwd: "/tmp/wt".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };

        let text = build_system_text(&ctx);
        assert!(text.contains("42_story_test"));
        assert!(text.contains("/tmp/wt"));
    }

    #[test]
    fn parse_usage_valid() {
        let response = json!({
            "usage": {
                "prompt_tokens": 100,
                "completion_tokens": 50,
                "total_tokens": 150
            }
        });

        let usage = parse_usage(&response).unwrap();
        assert_eq!(usage.input_tokens, 100);
        assert_eq!(usage.output_tokens, 50);
        assert_eq!(usage.cache_creation_input_tokens, 0);
        assert_eq!(usage.total_cost_usd, 0.0);
    }

    #[test]
    fn parse_usage_missing() {
        let response = json!({"choices": []});
        assert!(parse_usage(&response).is_none());
    }

    #[test]
    fn openai_runtime_stop_sets_cancelled() {
        let runtime = OpenAiRuntime::new();
        assert_eq!(runtime.get_status(), RuntimeStatus::Idle);
        runtime.stop();
        assert_eq!(runtime.get_status(), RuntimeStatus::Failed);
    }

    #[test]
    fn model_extraction_from_command_gpt() {
        let ctx = RuntimeContext {
            story_id: "1".to_string(),
            agent_name: "coder".to_string(),
            command: "gpt-4o".to_string(),
            args: vec![],
            prompt: "test".to_string(),
            cwd: "/tmp".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };
        assert!(ctx.command.starts_with("gpt"));
    }

    #[test]
    fn model_extraction_from_command_o3() {
        let ctx = RuntimeContext {
            story_id: "1".to_string(),
            agent_name: "coder".to_string(),
            command: "o3".to_string(),
            args: vec![],
            prompt: "test".to_string(),
            cwd: "/tmp".to_string(),
            inactivity_timeout_secs: 300,
            mcp_port: 3001,
        };
        assert!(ctx.command.starts_with("o"));
    }
}
@@ -117,6 +117,11 @@ pub struct AgentConfig {
     /// and marked as Failed. Default: 300 (5 minutes). Set to 0 to disable.
     #[serde(default = "default_inactivity_timeout_secs")]
     pub inactivity_timeout_secs: u64,
+    /// Agent runtime backend. Controls how the agent process is spawned and
+    /// how events are streamed. Default: `"claude-code"` (spawns the `claude`
+    /// CLI in a PTY). Future values: `"openai"`, `"gemini"`.
+    #[serde(default)]
+    pub runtime: Option<String>,
 }

 fn default_path() -> String {
@@ -178,6 +183,7 @@ impl Default for ProjectConfig {
             system_prompt: None,
             stage: None,
             inactivity_timeout_secs: default_inactivity_timeout_secs(),
+            runtime: None,
         }],
         watcher: WatcherConfig::default(),
         default_qa: default_qa(),
@@ -370,6 +376,17 @@ fn validate_agents(agents: &[AgentConfig]) -> Result<(), String> {
                 agent.name
             ));
         }
+        if let Some(ref runtime) = agent.runtime {
+            match runtime.as_str() {
+                "claude-code" | "gemini" => {}
+                other => {
+                    return Err(format!(
+                        "Agent '{}': unknown runtime '{other}'. Supported: 'claude-code', 'gemini'",
+                        agent.name
+                    ));
+                }
+            }
+        }
     }
     Ok(())
 }
@@ -792,6 +809,55 @@ name = "coder-1"
         assert_eq!(config.max_coders, Some(3));
     }

+    // ── runtime config ────────────────────────────────────────────────
+
+    #[test]
+    fn runtime_defaults_to_none() {
+        let toml_str = r#"
+[[agent]]
+name = "coder"
+"#;
+        let config = ProjectConfig::parse(toml_str).unwrap();
+        assert_eq!(config.agent[0].runtime, None);
+    }
+
+    #[test]
+    fn runtime_claude_code_accepted() {
+        let toml_str = r#"
+[[agent]]
+name = "coder"
+runtime = "claude-code"
+"#;
+        let config = ProjectConfig::parse(toml_str).unwrap();
+        assert_eq!(
+            config.agent[0].runtime,
+            Some("claude-code".to_string())
+        );
+    }
+
+    #[test]
+    fn runtime_gemini_accepted() {
+        let toml_str = r#"
+[[agent]]
+name = "coder"
+runtime = "gemini"
+model = "gemini-2.5-pro"
+"#;
+        let config = ProjectConfig::parse(toml_str).unwrap();
+        assert_eq!(config.agent[0].runtime, Some("gemini".to_string()));
+    }
+
+    #[test]
+    fn runtime_unknown_rejected() {
+        let toml_str = r#"
+[[agent]]
+name = "coder"
+runtime = "openai"
+"#;
+        let err = ProjectConfig::parse(toml_str).unwrap_err();
+        assert!(err.contains("unknown runtime 'openai'"));
+    }
+
     #[test]
     fn project_toml_has_three_sonnet_coders() {
         let manifest_dir = std::path::Path::new(env!("CARGO_MANIFEST_DIR"));
@@ -3,7 +3,7 @@ use crate::llm::chat;
 use crate::store::StoreOps;
 use poem_openapi::{Object, OpenApi, Tags, payload::Json};
 use reqwest::header::{HeaderMap, HeaderValue};
-use serde::Deserialize;
+use serde::{Deserialize, Serialize};
 use std::sync::Arc;

 const ANTHROPIC_MODELS_URL: &str = "https://api.anthropic.com/v1/models";
@@ -18,6 +18,13 @@ struct AnthropicModelsResponse {
 #[derive(Deserialize)]
 struct AnthropicModelInfo {
     id: String,
+    context_window: u64,
+}
+
+#[derive(Serialize, Object)]
+struct AnthropicModelSummary {
+    id: String,
+    context_window: u64,
 }

 fn get_anthropic_api_key(ctx: &AppContext) -> Result<String, String> {
@@ -84,7 +91,7 @@ impl AnthropicApi {

     /// List available Anthropic models.
     #[oai(path = "/anthropic/models", method = "get")]
-    async fn list_anthropic_models(&self) -> OpenApiResult<Json<Vec<String>>> {
+    async fn list_anthropic_models(&self) -> OpenApiResult<Json<Vec<AnthropicModelSummary>>> {
         self.list_anthropic_models_from(ANTHROPIC_MODELS_URL).await
     }
 }
@@ -93,7 +100,7 @@ impl AnthropicApi {
     async fn list_anthropic_models_from(
         &self,
         url: &str,
-    ) -> OpenApiResult<Json<Vec<String>>> {
+    ) -> OpenApiResult<Json<Vec<AnthropicModelSummary>>> {
         let api_key = get_anthropic_api_key(self.ctx.as_ref()).map_err(bad_request)?;
         let client = reqwest::Client::new();
         let mut headers = HeaderMap::new();
@@ -128,7 +135,14 @@ impl AnthropicApi {
             .json::<AnthropicModelsResponse>()
             .await
             .map_err(|e| bad_request(e.to_string()))?;
-        let models = body.data.into_iter().map(|m| m.id).collect();
+        let models = body
+            .data
+            .into_iter()
+            .map(|m| AnthropicModelSummary {
+                id: m.id,
+                context_window: m.context_window,
+            })
+            .collect();

         Ok(Json(models))
     }
@@ -276,4 +290,29 @@ mod tests {
         let dir = TempDir::new().unwrap();
         let _api = make_api(&dir);
     }
+
+    #[test]
+    fn anthropic_model_info_deserializes_context_window() {
+        let json = json!({
+            "id": "claude-opus-4-5",
+            "context_window": 200000
+        });
+        let info: AnthropicModelInfo = serde_json::from_value(json).unwrap();
+        assert_eq!(info.id, "claude-opus-4-5");
+        assert_eq!(info.context_window, 200000);
+    }
+
+    #[test]
+    fn anthropic_models_response_deserializes_multiple_models() {
+        let json = json!({
+            "data": [
+                { "id": "claude-opus-4-5", "context_window": 200000 },
+                { "id": "claude-haiku-4-5-20251001", "context_window": 100000 }
+            ]
+        });
+        let response: AnthropicModelsResponse = serde_json::from_value(json).unwrap();
+        assert_eq!(response.data.len(), 2);
+        assert_eq!(response.data[0].context_window, 200000);
+        assert_eq!(response.data[1].context_window, 100000);
+    }
 }
|||||||
@@ -1,4 +1,4 @@
-use crate::agents::{AgentStatus, move_story_to_stage};
+use crate::agents::move_story_to_stage;
 use crate::http::context::AppContext;
 use crate::log_buffer;
 use crate::slog;
@@ -26,98 +26,11 @@ pub(super) fn tool_get_server_logs(args: &Value) -> Result<String, String> {
     Ok(all_lines[start..].join("\n"))
 }
 
-/// Rebuild the server binary and re-exec.
+/// Rebuild the server binary and re-exec (delegates to `crate::rebuild`).
-///
-/// 1. Gracefully stops all running agents (kills PTY children).
-/// 2. Runs `cargo build [-p storkit]` from the workspace root, matching
-///    the current build profile (debug or release).
-/// 3. If the build fails, returns the build error (server stays up).
-/// 4. If the build succeeds, re-execs the process with the new binary via
-///    `std::os::unix::process::CommandExt::exec()`.
 pub(super) async fn tool_rebuild_and_restart(ctx: &AppContext) -> Result<String, String> {
     slog!("[rebuild] Rebuild and restart requested via MCP tool");
+    let project_root = ctx.state.get_project_root().unwrap_or_default();
+    crate::rebuild::rebuild_and_restart(&ctx.agents, &project_root).await
-    // 1. Gracefully stop all running agents.
-    let running_agents = ctx.agents.list_agents().unwrap_or_default();
-    let running_count = running_agents
-        .iter()
-        .filter(|a| a.status == AgentStatus::Running)
-        .count();
-    if running_count > 0 {
-        slog!("[rebuild] Stopping {running_count} running agent(s) before rebuild");
-    }
-    ctx.agents.kill_all_children();
-
-    // 2. Find the workspace root (parent of the server binary's source).
-    // CARGO_MANIFEST_DIR at compile time points to the `server/` crate;
-    // the workspace root is its parent.
-    let manifest_dir = std::path::Path::new(env!("CARGO_MANIFEST_DIR"));
-    let workspace_root = manifest_dir
-        .parent()
-        .ok_or_else(|| "Cannot determine workspace root from CARGO_MANIFEST_DIR".to_string())?;
-
-    slog!(
-        "[rebuild] Building server from workspace root: {}",
-        workspace_root.display()
-    );
-
-    // 3. Build the server binary, matching the current build profile so the
-    // re-exec via current_exe() picks up the new binary.
-    let build_args: Vec<&str> = if cfg!(debug_assertions) {
-        vec!["build", "-p", "storkit"]
-    } else {
-        vec!["build", "--release", "-p", "storkit"]
-    };
-    slog!("[rebuild] cargo {}", build_args.join(" "));
-    let output = tokio::task::spawn_blocking({
-        let workspace_root = workspace_root.to_path_buf();
-        move || {
-            std::process::Command::new("cargo")
-                .args(&build_args)
-                .current_dir(&workspace_root)
-                .output()
-        }
-    })
-    .await
-    .map_err(|e| format!("Build task panicked: {e}"))?
-    .map_err(|e| format!("Failed to run cargo build: {e}"))?;
-
-    if !output.status.success() {
-        let stderr = String::from_utf8_lossy(&output.stderr);
-        slog!("[rebuild] Build failed:\n{stderr}");
-        return Err(format!("Build failed:\n{stderr}"));
-    }
-
-    slog!("[rebuild] Build succeeded, re-execing with new binary");
-
-    // 4. Re-exec with the new binary.
-    // Collect current argv so we preserve any CLI arguments (e.g. project path).
-    let current_exe =
-        std::env::current_exe().map_err(|e| format!("Cannot determine current executable: {e}"))?;
-    let args: Vec<String> = std::env::args().collect();
-
-    // Remove the port file before re-exec so the new process can write its own.
-    if let Ok(root) = ctx.state.get_project_root() {
-        let port_file = root.join(".storkit_port");
-        if port_file.exists() {
-            let _ = std::fs::remove_file(&port_file);
-        }
-    }
-    // Also check cwd for port file.
-    let cwd_port_file = std::path::Path::new(".storkit_port");
-    if cwd_port_file.exists() {
-        let _ = std::fs::remove_file(cwd_port_file);
-    }
-
-    // Use exec() to replace the current process.
-    // This never returns on success.
-    use std::os::unix::process::CommandExt;
-    let err = std::process::Command::new(&current_exe)
-        .args(&args[1..])
-        .exec();
-
-    // If we get here, exec() failed.
-    Err(format!("Failed to exec new binary: {err}"))
 }
 
 /// Generate a Claude Code permission rule string for the given tool name and input.
766 server/src/http/mcp/git_tools.rs Normal file
@@ -0,0 +1,766 @@
+use crate::http::context::AppContext;
+use serde_json::{json, Value};
+use std::path::PathBuf;
+
+/// Validates that `worktree_path` exists and is inside the project's
+/// `.storkit/worktrees/` directory. Returns the canonicalized path.
+fn validate_worktree_path(worktree_path: &str, ctx: &AppContext) -> Result<PathBuf, String> {
+    let wd = PathBuf::from(worktree_path);
+
+    if !wd.is_absolute() {
+        return Err("worktree_path must be an absolute path".to_string());
+    }
+    if !wd.exists() {
+        return Err(format!(
+            "worktree_path does not exist: {worktree_path}"
+        ));
+    }
+
+    let project_root = ctx.agents.get_project_root(&ctx.state)?;
+    let worktrees_root = project_root.join(".storkit").join("worktrees");
+
+    let canonical_wd = wd
+        .canonicalize()
+        .map_err(|e| format!("Cannot canonicalize worktree_path: {e}"))?;
+
+    let canonical_wt = if worktrees_root.exists() {
+        worktrees_root
+            .canonicalize()
+            .map_err(|e| format!("Cannot canonicalize worktrees root: {e}"))?
+    } else {
+        return Err("No worktrees directory found in project".to_string());
+    };
+
+    if !canonical_wd.starts_with(&canonical_wt) {
+        return Err(format!(
+            "worktree_path must be inside .storkit/worktrees/. Got: {worktree_path}"
+        ));
+    }
+
+    Ok(canonical_wd)
+}
+
+/// Run a git command in the given directory and return its output.
+async fn run_git(args: Vec<&'static str>, dir: PathBuf) -> Result<std::process::Output, String> {
+    tokio::task::spawn_blocking(move || {
+        std::process::Command::new("git")
+            .args(&args)
+            .current_dir(&dir)
+            .output()
+    })
+    .await
+    .map_err(|e| format!("Task join error: {e}"))?
+    .map_err(|e| format!("Failed to run git: {e}"))
+}
+
+/// Run a git command with owned args in the given directory.
+async fn run_git_owned(args: Vec<String>, dir: PathBuf) -> Result<std::process::Output, String> {
+    tokio::task::spawn_blocking(move || {
+        std::process::Command::new("git")
+            .args(&args)
+            .current_dir(&dir)
+            .output()
+    })
+    .await
+    .map_err(|e| format!("Task join error: {e}"))?
+    .map_err(|e| format!("Failed to run git: {e}"))
+}
+
+/// git_status — returns working tree status (staged, unstaged, untracked files).
+pub(super) async fn tool_git_status(args: &Value, ctx: &AppContext) -> Result<String, String> {
+    let worktree_path = args
+        .get("worktree_path")
+        .and_then(|v| v.as_str())
+        .ok_or("Missing required argument: worktree_path")?;
+
+    let dir = validate_worktree_path(worktree_path, ctx)?;
+
+    let output = run_git(vec!["status", "--porcelain=v1", "-u"], dir).await?;
+
+    let stdout = String::from_utf8_lossy(&output.stdout);
+    let stderr = String::from_utf8_lossy(&output.stderr);
+
+    if !output.status.success() {
+        return Err(format!(
+            "git status failed (exit {}): {stderr}",
+            output.status.code().unwrap_or(-1)
+        ));
+    }
+
+    let mut staged: Vec<String> = Vec::new();
+    let mut unstaged: Vec<String> = Vec::new();
+    let mut untracked: Vec<String> = Vec::new();
+
+    for line in stdout.lines() {
+        if line.len() < 3 {
+            continue;
+        }
+        let x = line.chars().next().unwrap_or(' ');
+        let y = line.chars().nth(1).unwrap_or(' ');
+        let path = line[3..].to_string();
+
+        match (x, y) {
+            ('?', '?') => untracked.push(path),
+            (' ', _) => unstaged.push(path),
+            (_, ' ') => staged.push(path),
+            _ => {
+                // Both staged and unstaged modifications
+                staged.push(path.clone());
+                unstaged.push(path);
+            }
+        }
+    }
+
+    serde_json::to_string_pretty(&json!({
+        "staged": staged,
+        "unstaged": unstaged,
+        "untracked": untracked,
+        "clean": staged.is_empty() && unstaged.is_empty() && untracked.is_empty(),
+    }))
+    .map_err(|e| format!("Serialization error: {e}"))
+}
+
+/// git_diff — returns diff output. Supports staged/unstaged/commit range.
+pub(super) async fn tool_git_diff(args: &Value, ctx: &AppContext) -> Result<String, String> {
+    let worktree_path = args
+        .get("worktree_path")
+        .and_then(|v| v.as_str())
+        .ok_or("Missing required argument: worktree_path")?;
+
+    let dir = validate_worktree_path(worktree_path, ctx)?;
+
+    let staged = args
+        .get("staged")
+        .and_then(|v| v.as_bool())
+        .unwrap_or(false);
+
+    let commit_range = args
+        .get("commit_range")
+        .and_then(|v| v.as_str())
+        .map(|s| s.to_string());
+
+    let mut git_args: Vec<String> = vec!["diff".to_string()];
+
+    if staged {
+        git_args.push("--staged".to_string());
+    }
+
+    if let Some(range) = commit_range {
+        git_args.push(range);
+    }
+
+    let output = run_git_owned(git_args, dir).await?;
+
+    let stdout = String::from_utf8_lossy(&output.stdout);
+    let stderr = String::from_utf8_lossy(&output.stderr);
+
+    if !output.status.success() {
+        return Err(format!(
+            "git diff failed (exit {}): {stderr}",
+            output.status.code().unwrap_or(-1)
+        ));
+    }
+
+    serde_json::to_string_pretty(&json!({
+        "diff": stdout.as_ref(),
+        "exit_code": output.status.code().unwrap_or(-1),
+    }))
+    .map_err(|e| format!("Serialization error: {e}"))
+}
+
+/// git_add — stages files by path.
+pub(super) async fn tool_git_add(args: &Value, ctx: &AppContext) -> Result<String, String> {
+    let worktree_path = args
+        .get("worktree_path")
+        .and_then(|v| v.as_str())
+        .ok_or("Missing required argument: worktree_path")?;
+
+    let paths: Vec<String> = args
+        .get("paths")
+        .and_then(|v| v.as_array())
+        .ok_or("Missing required argument: paths (must be an array of strings)")?
+        .iter()
+        .filter_map(|v| v.as_str().map(|s| s.to_string()))
+        .collect();
+
+    if paths.is_empty() {
+        return Err("paths must be a non-empty array of strings".to_string());
+    }
+
+    let dir = validate_worktree_path(worktree_path, ctx)?;
+
+    let mut git_args: Vec<String> = vec!["add".to_string(), "--".to_string()];
+    git_args.extend(paths.clone());
+
+    let output = run_git_owned(git_args, dir).await?;
+
+    let stderr = String::from_utf8_lossy(&output.stderr);
+
+    if !output.status.success() {
+        return Err(format!(
+            "git add failed (exit {}): {stderr}",
+            output.status.code().unwrap_or(-1)
+        ));
+    }
+
+    serde_json::to_string_pretty(&json!({
+        "staged": paths,
+        "exit_code": output.status.code().unwrap_or(0),
+    }))
+    .map_err(|e| format!("Serialization error: {e}"))
+}
+
+/// git_commit — commits staged changes with a message.
+pub(super) async fn tool_git_commit(args: &Value, ctx: &AppContext) -> Result<String, String> {
+    let worktree_path = args
+        .get("worktree_path")
+        .and_then(|v| v.as_str())
+        .ok_or("Missing required argument: worktree_path")?;
+
+    let message = args
+        .get("message")
+        .and_then(|v| v.as_str())
+        .ok_or("Missing required argument: message")?
+        .to_string();
+
+    if message.trim().is_empty() {
+        return Err("message must not be empty".to_string());
+    }
+
+    let dir = validate_worktree_path(worktree_path, ctx)?;
+
+    let git_args: Vec<String> = vec![
+        "commit".to_string(),
+        "--message".to_string(),
+        message,
+    ];
+
+    let output = run_git_owned(git_args, dir).await?;
+
+    let stdout = String::from_utf8_lossy(&output.stdout);
+    let stderr = String::from_utf8_lossy(&output.stderr);
+
+    if !output.status.success() {
+        return Err(format!(
+            "git commit failed (exit {}): {stderr}",
+            output.status.code().unwrap_or(-1)
+        ));
+    }
+
+    serde_json::to_string_pretty(&json!({
+        "output": stdout.as_ref(),
+        "exit_code": output.status.code().unwrap_or(0),
+    }))
+    .map_err(|e| format!("Serialization error: {e}"))
+}
+
+/// git_log — returns commit history with configurable count and format.
+pub(super) async fn tool_git_log(args: &Value, ctx: &AppContext) -> Result<String, String> {
+    let worktree_path = args
+        .get("worktree_path")
+        .and_then(|v| v.as_str())
+        .ok_or("Missing required argument: worktree_path")?;
+
+    let dir = validate_worktree_path(worktree_path, ctx)?;
+
+    let count = args
+        .get("count")
+        .and_then(|v| v.as_u64())
+        .unwrap_or(10)
+        .min(500);
+
+    let format = args
+        .get("format")
+        .and_then(|v| v.as_str())
+        .unwrap_or("%H%x09%s%x09%an%x09%ai")
+        .to_string();
+
+    let git_args: Vec<String> = vec![
+        "log".to_string(),
+        format!("--max-count={count}"),
+        format!("--pretty=format:{format}"),
+    ];
+
+    let output = run_git_owned(git_args, dir).await?;
+
+    let stdout = String::from_utf8_lossy(&output.stdout);
+    let stderr = String::from_utf8_lossy(&output.stderr);
+
+    if !output.status.success() {
+        return Err(format!(
+            "git log failed (exit {}): {stderr}",
+            output.status.code().unwrap_or(-1)
+        ));
+    }
+
+    serde_json::to_string_pretty(&json!({
+        "log": stdout.as_ref(),
+        "exit_code": output.status.code().unwrap_or(0),
+    }))
+    .map_err(|e| format!("Serialization error: {e}"))
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::http::context::AppContext;
+    use serde_json::json;
+
+    fn test_ctx(dir: &std::path::Path) -> AppContext {
+        AppContext::new_test(dir.to_path_buf())
+    }
+
+    /// Create a temp directory with a git worktree structure and init a repo.
+    fn setup_worktree() -> (tempfile::TempDir, PathBuf, AppContext) {
+        let tmp = tempfile::tempdir().unwrap();
+        let story_wt = tmp
+            .path()
+            .join(".storkit")
+            .join("worktrees")
+            .join("42_test_story");
+        std::fs::create_dir_all(&story_wt).unwrap();
+
+        // Init git repo in the worktree
+        std::process::Command::new("git")
+            .args(["init"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["config", "user.email", "test@test.com"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["config", "user.name", "Test"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        let ctx = test_ctx(tmp.path());
+        (tmp, story_wt, ctx)
+    }
+
+    // ── validate_worktree_path ─────────────────────────────────────────
+
+    #[test]
+    fn validate_rejects_relative_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = validate_worktree_path("relative/path", &ctx);
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("absolute"));
+    }
+
+    #[test]
+    fn validate_rejects_nonexistent_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = validate_worktree_path("/nonexistent_path_xyz_git", &ctx);
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("does not exist"));
+    }
+
+    #[test]
+    fn validate_rejects_path_outside_worktrees() {
+        let tmp = tempfile::tempdir().unwrap();
+        let wt_dir = tmp.path().join(".storkit").join("worktrees");
+        std::fs::create_dir_all(&wt_dir).unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = validate_worktree_path(tmp.path().to_str().unwrap(), &ctx);
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("inside .storkit/worktrees"));
+    }
+
+    #[test]
+    fn validate_accepts_path_inside_worktrees() {
+        let tmp = tempfile::tempdir().unwrap();
+        let story_wt = tmp
+            .path()
+            .join(".storkit")
+            .join("worktrees")
+            .join("42_test_story");
+        std::fs::create_dir_all(&story_wt).unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = validate_worktree_path(story_wt.to_str().unwrap(), &ctx);
+        assert!(result.is_ok(), "expected Ok, got: {:?}", result);
+    }
+
+    // ── git_status ────────────────────────────────────────────────────
+
+    #[tokio::test]
+    async fn git_status_missing_worktree_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = tool_git_status(&json!({}), &ctx).await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("worktree_path"));
+    }
+
+    #[tokio::test]
+    async fn git_status_clean_repo() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Make an initial commit so HEAD exists
+        std::fs::write(story_wt.join("readme.txt"), "hello").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "."])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["commit", "-m", "init"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        let result = tool_git_status(
+            &json!({"worktree_path": story_wt.to_str().unwrap()}),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert_eq!(parsed["clean"], true);
+        assert!(parsed["staged"].as_array().unwrap().is_empty());
+        assert!(parsed["unstaged"].as_array().unwrap().is_empty());
+        assert!(parsed["untracked"].as_array().unwrap().is_empty());
+    }
+
+    #[tokio::test]
+    async fn git_status_shows_untracked_file() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Make initial commit
+        std::fs::write(story_wt.join("readme.txt"), "hello").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "."])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["commit", "-m", "init"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        // Add untracked file
+        std::fs::write(story_wt.join("new_file.txt"), "content").unwrap();
+
+        let result = tool_git_status(
+            &json!({"worktree_path": story_wt.to_str().unwrap()}),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert_eq!(parsed["clean"], false);
+        let untracked = parsed["untracked"].as_array().unwrap();
+        assert!(
+            untracked.iter().any(|v| v.as_str().unwrap().contains("new_file.txt")),
+            "expected new_file.txt in untracked: {parsed}"
+        );
+    }
+
+    // ── git_diff ──────────────────────────────────────────────────────
+
+    #[tokio::test]
+    async fn git_diff_missing_worktree_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = tool_git_diff(&json!({}), &ctx).await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("worktree_path"));
+    }
+
+    #[tokio::test]
+    async fn git_diff_returns_diff() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Create initial commit
+        std::fs::write(story_wt.join("file.txt"), "line1\n").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "."])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["commit", "-m", "init"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        // Modify file (unstaged)
+        std::fs::write(story_wt.join("file.txt"), "line1\nline2\n").unwrap();
+
+        let result = tool_git_diff(
+            &json!({"worktree_path": story_wt.to_str().unwrap()}),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert!(
+            parsed["diff"].as_str().unwrap().contains("line2"),
+            "expected diff output: {parsed}"
+        );
+    }
+
+    #[tokio::test]
+    async fn git_diff_staged_flag() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Create initial commit
+        std::fs::write(story_wt.join("file.txt"), "line1\n").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "."])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["commit", "-m", "init"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        // Stage a modification
+        std::fs::write(story_wt.join("file.txt"), "line1\nstaged_change\n").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "file.txt"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        let result = tool_git_diff(
+            &json!({"worktree_path": story_wt.to_str().unwrap(), "staged": true}),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert!(
+            parsed["diff"].as_str().unwrap().contains("staged_change"),
+            "expected staged diff: {parsed}"
+        );
+    }
+
+    // ── git_add ───────────────────────────────────────────────────────
+
+    #[tokio::test]
+    async fn git_add_missing_worktree_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = tool_git_add(&json!({"paths": ["file.txt"]}), &ctx).await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("worktree_path"));
+    }
+
+    #[tokio::test]
+    async fn git_add_missing_paths() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+        let result = tool_git_add(
+            &json!({"worktree_path": story_wt.to_str().unwrap()}),
+            &ctx,
+        )
+        .await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("paths"));
+    }
+
+    #[tokio::test]
+    async fn git_add_empty_paths() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+        let result = tool_git_add(
+            &json!({"worktree_path": story_wt.to_str().unwrap(), "paths": []}),
+            &ctx,
+        )
+        .await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("non-empty"));
+    }
+
+    #[tokio::test]
+    async fn git_add_stages_file() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        std::fs::write(story_wt.join("file.txt"), "content").unwrap();
+
+        let result = tool_git_add(
+            &json!({
+                "worktree_path": story_wt.to_str().unwrap(),
+                "paths": ["file.txt"]
+            }),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert_eq!(parsed["exit_code"], 0);
+        let staged = parsed["staged"].as_array().unwrap();
+        assert!(staged.iter().any(|v| v.as_str().unwrap() == "file.txt"));
+
+        // Verify file is actually staged
+        let status = std::process::Command::new("git")
+            .args(["status", "--porcelain"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        let output = String::from_utf8_lossy(&status.stdout);
+        assert!(output.contains("A  file.txt"), "file should be staged: {output}");
+    }
+
+    // ── git_commit ────────────────────────────────────────────────────
+
+    #[tokio::test]
+    async fn git_commit_missing_worktree_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = tool_git_commit(&json!({"message": "test"}), &ctx).await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("worktree_path"));
+    }
+
+    #[tokio::test]
+    async fn git_commit_missing_message() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+        let result = tool_git_commit(
+            &json!({"worktree_path": story_wt.to_str().unwrap()}),
+            &ctx,
+        )
+        .await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("message"));
+    }
+
+    #[tokio::test]
+    async fn git_commit_empty_message() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+        let result = tool_git_commit(
+            &json!({"worktree_path": story_wt.to_str().unwrap(), "message": "   "}),
+            &ctx,
+        )
+        .await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("empty"));
+    }
+
+    #[tokio::test]
+    async fn git_commit_creates_commit() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Stage a file
+        std::fs::write(story_wt.join("file.txt"), "content").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "file.txt"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        let result = tool_git_commit(
+            &json!({
+                "worktree_path": story_wt.to_str().unwrap(),
+                "message": "test commit message"
+            }),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert_eq!(parsed["exit_code"], 0);
+
+        // Verify commit exists
+        let log = std::process::Command::new("git")
+            .args(["log", "--oneline"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        let log_output = String::from_utf8_lossy(&log.stdout);
+        assert!(
+            log_output.contains("test commit message"),
+            "expected commit in log: {log_output}"
+        );
+    }
+
+    // ── git_log ───────────────────────────────────────────────────────
+
+    #[tokio::test]
+    async fn git_log_missing_worktree_path() {
+        let tmp = tempfile::tempdir().unwrap();
+        let ctx = test_ctx(tmp.path());
+        let result = tool_git_log(&json!({}), &ctx).await;
+        assert!(result.is_err());
+        assert!(result.unwrap_err().contains("worktree_path"));
+    }
+
+    #[tokio::test]
+    async fn git_log_returns_history() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Make a commit
+        std::fs::write(story_wt.join("file.txt"), "content").unwrap();
+        std::process::Command::new("git")
+            .args(["add", "."])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+        std::process::Command::new("git")
+            .args(["commit", "-m", "first commit"])
+            .current_dir(&story_wt)
+            .output()
+            .unwrap();
+
+        let result = tool_git_log(
+            &json!({"worktree_path": story_wt.to_str().unwrap()}),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        assert_eq!(parsed["exit_code"], 0);
+        assert!(
+            parsed["log"].as_str().unwrap().contains("first commit"),
+            "expected commit in log: {parsed}"
+        );
+    }
+
+    #[tokio::test]
+    async fn git_log_respects_count() {
+        let (_tmp, story_wt, ctx) = setup_worktree();
+
+        // Make multiple commits
+        for i in 0..5 {
+            std::fs::write(story_wt.join("file.txt"), format!("content {i}")).unwrap();
+            std::process::Command::new("git")
+                .args(["add", "."])
+                .current_dir(&story_wt)
+                .output()
+                .unwrap();
+            std::process::Command::new("git")
+                .args(["commit", "-m", &format!("commit {i}")])
+                .current_dir(&story_wt)
+                .output()
+                .unwrap();
+        }
+
+        let result = tool_git_log(
+            &json!({"worktree_path": story_wt.to_str().unwrap(), "count": 2}),
+            &ctx,
+        )
+        .await
+        .unwrap();
+
+        let parsed: serde_json::Value = serde_json::from_str(&result).unwrap();
+        // With count=2, only 2 commit entries should appear
+        let log = parsed["log"].as_str().unwrap();
+        // Each log line is tab-separated; count newlines
+        let lines: Vec<&str> = log.lines().collect();
+        assert_eq!(lines.len(), 2, "expected 2 log entries, got: {log}");
+    }
+}
@@ -10,6 +10,7 @@ use std::sync::Arc;

 pub mod agent_tools;
 pub mod diagnostics;
+pub mod git_tools;
 pub mod merge_tools;
 pub mod qa_tools;
 pub mod shell_tools;
@@ -1025,6 +1026,101 @@ fn handle_tools_list(id: Option<Value>) -> JsonRpcResponse {
                     },
                     "required": ["command", "working_dir"]
                 }
+            },
+            {
+                "name": "git_status",
+                "description": "Return the working tree status of an agent's worktree (staged, unstaged, and untracked files). The worktree_path must be inside .storkit/worktrees/. Push and remote operations are not available.",
+                "inputSchema": {
+                    "type": "object",
+                    "properties": {
+                        "worktree_path": {
+                            "type": "string",
+                            "description": "Absolute path to the worktree directory. Must be inside .storkit/worktrees/."
+                        }
+                    },
+                    "required": ["worktree_path"]
+                }
+            },
+            {
+                "name": "git_diff",
+                "description": "Return diff output for an agent's worktree. Supports unstaged (default), staged, or a commit range. The worktree_path must be inside .storkit/worktrees/.",
+                "inputSchema": {
+                    "type": "object",
+                    "properties": {
+                        "worktree_path": {
+                            "type": "string",
+                            "description": "Absolute path to the worktree directory. Must be inside .storkit/worktrees/."
+                        },
+                        "staged": {
+                            "type": "boolean",
+                            "description": "If true, show staged diff (--staged). Default: false."
+                        },
+                        "commit_range": {
+                            "type": "string",
+                            "description": "Optional commit range (e.g. 'HEAD~3..HEAD', 'abc123..def456')."
+                        }
+                    },
+                    "required": ["worktree_path"]
+                }
+            },
+            {
+                "name": "git_add",
+                "description": "Stage files by path in an agent's worktree. The worktree_path must be inside .storkit/worktrees/.",
+                "inputSchema": {
+                    "type": "object",
+                    "properties": {
+                        "worktree_path": {
+                            "type": "string",
+                            "description": "Absolute path to the worktree directory. Must be inside .storkit/worktrees/."
+                        },
+                        "paths": {
+                            "type": "array",
+                            "items": { "type": "string" },
+                            "description": "List of file paths to stage (relative to worktree_path)."
+                        }
+                    },
+                    "required": ["worktree_path", "paths"]
+                }
+            },
+            {
+                "name": "git_commit",
+                "description": "Commit staged changes in an agent's worktree with the given message. The worktree_path must be inside .storkit/worktrees/. Push and remote operations are not available.",
+                "inputSchema": {
+                    "type": "object",
+                    "properties": {
+                        "worktree_path": {
+                            "type": "string",
+                            "description": "Absolute path to the worktree directory. Must be inside .storkit/worktrees/."
+                        },
+                        "message": {
+                            "type": "string",
+                            "description": "Commit message."
+                        }
+                    },
+                    "required": ["worktree_path", "message"]
+                }
+            },
+            {
+                "name": "git_log",
+                "description": "Return commit history for an agent's worktree with configurable count and format. The worktree_path must be inside .storkit/worktrees/.",
+                "inputSchema": {
+                    "type": "object",
+                    "properties": {
+                        "worktree_path": {
+                            "type": "string",
+                            "description": "Absolute path to the worktree directory. Must be inside .storkit/worktrees/."
+                        },
+                        "count": {
+                            "type": "integer",
+                            "description": "Number of commits to return (default: 10, max: 500)."
+                        },
+                        "format": {
+                            "type": "string",
+                            "description": "git pretty-format string (default: '%H%x09%s%x09%an%x09%ai')."
+                        }
+                    },
+                    "required": ["worktree_path"]
+                }
             }
         ]
     }),
@@ -1107,6 +1203,12 @@ async fn handle_tools_call(
         "move_story" => diagnostics::tool_move_story(&args, ctx),
         // Shell command execution
         "run_command" => shell_tools::tool_run_command(&args, ctx).await,
+        // Git operations
+        "git_status" => git_tools::tool_git_status(&args, ctx).await,
+        "git_diff" => git_tools::tool_git_diff(&args, ctx).await,
+        "git_add" => git_tools::tool_git_add(&args, ctx).await,
+        "git_commit" => git_tools::tool_git_commit(&args, ctx).await,
+        "git_log" => git_tools::tool_git_log(&args, ctx).await,
         _ => Err(format!("Unknown tool: {tool_name}")),
     };

@@ -1217,7 +1319,12 @@ mod tests {
         assert!(names.contains(&"move_story"));
         assert!(names.contains(&"delete_story"));
         assert!(names.contains(&"run_command"));
-        assert_eq!(tools.len(), 43);
+        assert!(names.contains(&"git_status"));
+        assert!(names.contains(&"git_diff"));
+        assert!(names.contains(&"git_add"));
+        assert!(names.contains(&"git_commit"));
+        assert!(names.contains(&"git_log"));
+        assert_eq!(tools.len(), 48);
     }

     #[test]
@@ -10,6 +10,7 @@ mod io;
 mod llm;
 pub mod log_buffer;
 mod matrix;
+pub mod rebuild;
 pub mod slack;
 mod state;
 mod store;
@@ -960,6 +960,39 @@ async fn on_room_message(
         return;
     }

+    // Check for the rebuild command, which requires async agent and process ops
+    // and cannot be handled by the sync command registry.
+    if super::rebuild::extract_rebuild_command(
+        &user_message,
+        &ctx.bot_name,
+        ctx.bot_user_id.as_str(),
+    )
+    .is_some()
+    {
+        slog!("[matrix-bot] Handling rebuild command from {sender}");
+        // Acknowledge immediately — the rebuild may take a while or re-exec.
+        let ack = "Rebuilding server… this may take a moment.";
+        let ack_html = markdown_to_html(ack);
+        if let Ok(msg_id) = ctx.transport.send_message(&room_id_str, ack, &ack_html).await
+            && let Ok(event_id) = msg_id.parse()
+        {
+            ctx.bot_sent_event_ids.lock().await.insert(event_id);
+        }
+        let response = super::rebuild::handle_rebuild(
+            &ctx.bot_name,
+            &ctx.project_root,
+            &ctx.agents,
+        )
+        .await;
+        let html = markdown_to_html(&response);
+        if let Ok(msg_id) = ctx.transport.send_message(&room_id_str, &response, &html).await
+            && let Ok(event_id) = msg_id.parse()
+        {
+            ctx.bot_sent_event_ids.lock().await.insert(event_id);
+        }
+        return;
+    }
+
     // Spawn a separate task so the Matrix sync loop is not blocked while we
     // wait for the LLM response (which can take several seconds).
     tokio::spawn(async move {
385  server/src/matrix/commands/assign.rs  Normal file
@@ -0,0 +1,385 @@
//! Handler for the `assign` command.
//!
//! `assign <number> <model>` pre-assigns a coder model (e.g. `opus`, `sonnet`)
//! to a story before it starts. The assignment persists in the story file's
//! front matter as `agent: coder-<model>` so that when the pipeline picks up
//! the story — either via auto-assign or the `start` command — it uses the
//! assigned model instead of the default.

use super::CommandContext;
use crate::io::story_metadata::{parse_front_matter, set_front_matter_field};

/// All pipeline stage directories to search when finding a work item by number.
const STAGES: &[&str] = &[
    "1_backlog",
    "2_current",
    "3_qa",
    "4_merge",
    "5_done",
    "6_archived",
];

/// Resolve a model name hint (e.g. `"opus"`) to a full agent name
/// (e.g. `"coder-opus"`). If the hint already starts with `"coder-"`,
/// it is returned unchanged to prevent double-prefixing.
fn resolve_agent_name(model: &str) -> String {
    if model.starts_with("coder-") {
        model.to_string()
    } else {
        format!("coder-{model}")
    }
}

pub(super) fn handle_assign(ctx: &CommandContext) -> Option<String> {
    let args = ctx.args.trim();

    // Parse `<number> <model>` from args.
    let (number_str, model_str) = match args.split_once(char::is_whitespace) {
        Some((n, m)) => (n.trim(), m.trim()),
        None => {
            return Some(format!(
                "Usage: `{} assign <number> <model>` (e.g. `assign 42 opus`)",
                ctx.bot_name
            ));
        }
    };

    if number_str.is_empty() || !number_str.chars().all(|c| c.is_ascii_digit()) {
        return Some(format!(
            "Invalid story number `{number_str}`. Usage: `{} assign <number> <model>`",
            ctx.bot_name
        ));
    }

    if model_str.is_empty() {
        return Some(format!(
            "Usage: `{} assign <number> <model>` (e.g. `assign 42 opus`)",
            ctx.bot_name
        ));
    }

    // Find the story file across all pipeline stages.
    let mut found: Option<(std::path::PathBuf, String)> = None;
    'outer: for stage in STAGES {
        let dir = ctx.project_root.join(".storkit").join("work").join(stage);
        if !dir.exists() {
            continue;
        }
        if let Ok(entries) = std::fs::read_dir(&dir) {
            for entry in entries.flatten() {
                let path = entry.path();
                if path.extension().and_then(|e| e.to_str()) != Some("md") {
                    continue;
                }
                if let Some(stem) = path
                    .file_stem()
                    .and_then(|s| s.to_str())
                    .map(|s| s.to_string())
                {
                    let file_num = stem
                        .split('_')
                        .next()
                        .filter(|s| !s.is_empty() && s.chars().all(|c| c.is_ascii_digit()))
                        .unwrap_or("")
                        .to_string();
                    if file_num == number_str {
                        found = Some((path, stem));
                        break 'outer;
                    }
                }
            }
        }
    }

    let (path, story_id) = match found {
        Some(f) => f,
        None => {
            return Some(format!(
                "No story, bug, or spike with number **{number_str}** found."
            ));
        }
    };

    // Read the human-readable name from front matter for the response.
    let story_name = std::fs::read_to_string(&path)
        .ok()
        .and_then(|contents| {
            parse_front_matter(&contents)
                .ok()
                .and_then(|m| m.name)
        })
        .unwrap_or_else(|| story_id.clone());

    let agent_name = resolve_agent_name(model_str);

    // Write `agent: <agent_name>` into the story's front matter.
    let result = std::fs::read_to_string(&path)
        .map_err(|e| format!("Failed to read story file: {e}"))
        .and_then(|contents| {
            let updated = set_front_matter_field(&contents, "agent", &agent_name);
            std::fs::write(&path, &updated)
                .map_err(|e| format!("Failed to write story file: {e}"))
        });

    match result {
        Ok(()) => Some(format!(
            "Assigned **{agent_name}** to **{story_name}** (story {number_str}). \
             The model will be used when the story starts."
        )),
        Err(e) => Some(format!(
            "Failed to assign model to **{story_name}**: {e}"
        )),
    }
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

#[cfg(test)]
mod tests {
    use crate::agents::AgentPool;
    use std::collections::HashSet;
    use std::sync::{Arc, Mutex};

    use super::super::{CommandDispatch, try_handle_command};

    fn assign_cmd_with_root(root: &std::path::Path, args: &str) -> Option<String> {
        let agents = Arc::new(AgentPool::new_test(3000));
        let ambient_rooms = Arc::new(Mutex::new(HashSet::new()));
        let room_id = "!test:example.com".to_string();
        let dispatch = CommandDispatch {
            bot_name: "Timmy",
            bot_user_id: "@timmy:homeserver.local",
            project_root: root,
            agents: &agents,
            ambient_rooms: &ambient_rooms,
            room_id: &room_id,
        };
        try_handle_command(&dispatch, &format!("@timmy assign {args}"))
    }

    fn write_story_file(root: &std::path::Path, stage: &str, filename: &str, content: &str) {
        let dir = root.join(".storkit/work").join(stage);
        std::fs::create_dir_all(&dir).unwrap();
        std::fs::write(dir.join(filename), content).unwrap();
    }

    // -- registration / help ------------------------------------------------

    #[test]
    fn assign_command_is_registered() {
        use super::super::commands;
        let found = commands().iter().any(|c| c.name == "assign");
        assert!(found, "assign command must be in the registry");
    }

    #[test]
    fn assign_command_appears_in_help() {
        let result = super::super::tests::try_cmd_addressed(
            "Timmy",
            "@timmy:homeserver.local",
            "@timmy help",
        );
        let output = result.unwrap();
        assert!(
            output.contains("assign"),
            "help should list assign command: {output}"
        );
    }

    // -- argument validation ------------------------------------------------

    #[test]
    fn assign_no_args_returns_usage() {
        let tmp = tempfile::TempDir::new().unwrap();
        let output = assign_cmd_with_root(tmp.path(), "").unwrap();
        assert!(
            output.contains("Usage"),
            "no args should show usage: {output}"
        );
    }

    #[test]
    fn assign_missing_model_returns_usage() {
        let tmp = tempfile::TempDir::new().unwrap();
        let output = assign_cmd_with_root(tmp.path(), "42").unwrap();
        assert!(
            output.contains("Usage"),
            "missing model should show usage: {output}"
        );
    }

    #[test]
    fn assign_non_numeric_number_returns_error() {
        let tmp = tempfile::TempDir::new().unwrap();
        let output = assign_cmd_with_root(tmp.path(), "abc opus").unwrap();
        assert!(
            output.contains("Invalid story number"),
            "non-numeric number should return error: {output}"
        );
    }

    // -- story not found ----------------------------------------------------

    #[test]
    fn assign_unknown_story_returns_friendly_message() {
        let tmp = tempfile::TempDir::new().unwrap();
        // Create stage dirs but no matching story.
        for stage in &["1_backlog", "2_current"] {
            std::fs::create_dir_all(tmp.path().join(".storkit/work").join(stage)).unwrap();
        }
        let output = assign_cmd_with_root(tmp.path(), "999 opus").unwrap();
        assert!(
            output.contains("999") && output.contains("found"),
            "not-found message should include number and 'found': {output}"
        );
    }

    // -- successful assignment ----------------------------------------------

    #[test]
    fn assign_writes_agent_field_to_front_matter() {
        let tmp = tempfile::TempDir::new().unwrap();
        write_story_file(
            tmp.path(),
            "1_backlog",
            "42_story_test_feature.md",
            "---\nname: Test Feature\n---\n\n# Story 42\n",
        );

        let output = assign_cmd_with_root(tmp.path(), "42 opus").unwrap();
        assert!(
            output.contains("coder-opus"),
            "confirmation should include resolved agent name: {output}"
        );
        assert!(
            output.contains("Test Feature"),
            "confirmation should include story name: {output}"
        );

        // Verify the file was updated.
        let contents = std::fs::read_to_string(
            tmp.path()
                .join(".storkit/work/1_backlog/42_story_test_feature.md"),
        )
        .unwrap();
        assert!(
            contents.contains("agent: coder-opus"),
            "front matter should contain agent field: {contents}"
        );
    }

    #[test]
    fn assign_with_sonnet_writes_coder_sonnet() {
        let tmp = tempfile::TempDir::new().unwrap();
        write_story_file(
            tmp.path(),
            "2_current",
            "10_story_current.md",
            "---\nname: Current Story\n---\n",
        );

        assign_cmd_with_root(tmp.path(), "10 sonnet").unwrap();

        let contents = std::fs::read_to_string(
            tmp.path()
                .join(".storkit/work/2_current/10_story_current.md"),
        )
        .unwrap();
        assert!(
            contents.contains("agent: coder-sonnet"),
            "front matter should contain agent: coder-sonnet: {contents}"
        );
    }

    #[test]
    fn assign_with_already_prefixed_name_does_not_double_prefix() {
        let tmp = tempfile::TempDir::new().unwrap();
        write_story_file(
            tmp.path(),
            "1_backlog",
            "7_story_small.md",
            "---\nname: Small Story\n---\n",
        );

        let output = assign_cmd_with_root(tmp.path(), "7 coder-opus").unwrap();
        assert!(
            output.contains("coder-opus"),
            "should not double-prefix: {output}"
        );
        assert!(
            !output.contains("coder-coder-opus"),
            "must not double-prefix: {output}"
        );

        let contents = std::fs::read_to_string(
            tmp.path().join(".storkit/work/1_backlog/7_story_small.md"),
        )
        .unwrap();
        assert!(
            contents.contains("agent: coder-opus"),
            "must write coder-opus, not coder-coder-opus: {contents}"
        );
    }

    #[test]
    fn assign_overwrites_existing_agent_field() {
        let tmp = tempfile::TempDir::new().unwrap();
        write_story_file(
            tmp.path(),
            "1_backlog",
            "5_story_existing.md",
            "---\nname: Existing\nagent: coder-sonnet\n---\n",
        );

        assign_cmd_with_root(tmp.path(), "5 opus").unwrap();

        let contents = std::fs::read_to_string(
            tmp.path()
                .join(".storkit/work/1_backlog/5_story_existing.md"),
        )
        .unwrap();
        assert!(
            contents.contains("agent: coder-opus"),
            "should overwrite old agent with new: {contents}"
        );
        assert!(
            !contents.contains("coder-sonnet"),
            "old agent should no longer appear: {contents}"
        );
    }

    #[test]
    fn assign_finds_story_in_any_stage() {
        let tmp = tempfile::TempDir::new().unwrap();
        // Story is in 3_qa/, not backlog.
        write_story_file(
            tmp.path(),
            "3_qa",
            "99_story_in_qa.md",
            "---\nname: In QA\n---\n",
        );

        let output = assign_cmd_with_root(tmp.path(), "99 opus").unwrap();
        assert!(
            output.contains("coder-opus"),
            "should find story in qa stage: {output}"
        );
    }

    // -- resolve_agent_name unit tests --------------------------------------

    #[test]
    fn resolve_agent_name_prefixes_bare_model() {
        assert_eq!(super::resolve_agent_name("opus"), "coder-opus");
        assert_eq!(super::resolve_agent_name("sonnet"), "coder-sonnet");
        assert_eq!(super::resolve_agent_name("haiku"), "coder-haiku");
    }

    #[test]
    fn resolve_agent_name_does_not_double_prefix() {
        assert_eq!(super::resolve_agent_name("coder-opus"), "coder-opus");
        assert_eq!(super::resolve_agent_name("coder-sonnet"), "coder-sonnet");
    }
}
@@ -4,7 +4,9 @@ use super::{commands, CommandContext};

 pub(super) fn handle_help(ctx: &CommandContext) -> Option<String> {
     let mut output = format!("**{} Commands**\n\n", ctx.bot_name);
-    for cmd in commands() {
+    let mut sorted: Vec<_> = commands().iter().collect();
+    sorted.sort_by_key(|c| c.name);
+    for cmd in sorted {
         output.push_str(&format!("- **{}** — {}\n", cmd.name, cmd.description));
     }
     Some(output)
@@ -75,6 +77,26 @@ mod tests {
         assert!(output.contains("status"), "help should list status command: {output}");
     }

+    #[test]
+    fn help_output_is_alphabetical() {
+        let result = try_cmd_addressed("Timmy", "@timmy:homeserver.local", "@timmy help");
+        let output = result.unwrap();
+        // Search for **name** (bold markdown) to avoid substring matches in descriptions.
+        let mut positions: Vec<(usize, &str)> = commands()
+            .iter()
+            .map(|c| {
+                let marker = format!("**{}**", c.name);
+                let pos = output.find(&marker).expect("command must appear in help as **name**");
+                (pos, c.name)
+            })
+            .collect();
+        positions.sort_by_key(|(pos, _)| *pos);
+        let names_in_order: Vec<&str> = positions.iter().map(|(_, n)| *n).collect();
+        let mut sorted = names_in_order.clone();
+        sorted.sort();
+        assert_eq!(names_in_order, sorted, "commands must appear in alphabetical order");
+    }
+
     #[test]
     fn help_output_includes_ambient() {
         let result = try_cmd_addressed("Timmy", "@timmy:homeserver.local", "@timmy help");
@@ -6,6 +6,7 @@
 //! as they are added.

 mod ambient;
+mod assign;
 mod cost;
 mod git;
 mod help;
@@ -75,6 +76,11 @@ pub struct CommandContext<'a> {
 /// Add new commands here — they will automatically appear in `help` output.
 pub fn commands() -> &'static [BotCommand] {
     &[
+        BotCommand {
+            name: "assign",
+            description: "Pre-assign a model to a story: `assign <number> <model>` (e.g. `assign 42 opus`)",
+            handler: assign::handle_assign,
+        },
         BotCommand {
             name: "help",
             description: "Show this list of available commands",
@@ -135,6 +141,11 @@ pub fn commands() -> &'static [BotCommand] {
             description: "Clear the current Claude Code session and start fresh",
             handler: handle_reset_fallback,
         },
+        BotCommand {
+            name: "rebuild",
+            description: "Rebuild the server binary and restart",
+            handler: handle_rebuild_fallback,
+        },
     ]
 }

@@ -260,6 +271,16 @@ fn handle_reset_fallback(_ctx: &CommandContext) -> Option<String> {
     None
 }

+/// Fallback handler for the `rebuild` command when it is not intercepted by
+/// the async handler in `on_room_message`. In practice this is never called —
+/// rebuild is detected and handled before `try_handle_command` is invoked.
+/// The entry exists in the registry only so `help` lists it.
+///
+/// Returns `None` to prevent the LLM from receiving "rebuild" as a prompt.
+fn handle_rebuild_fallback(_ctx: &CommandContext) -> Option<String> {
+    None
+}
+
 // ---------------------------------------------------------------------------
 // Tests
 // ---------------------------------------------------------------------------
@@ -20,6 +20,7 @@ pub mod commands;
 mod config;
 pub mod delete;
 pub mod htop;
+pub mod rebuild;
 pub mod reset;
 pub mod start;
 pub mod notifications;
145
server/src/matrix/rebuild.rs
Normal file
145
server/src/matrix/rebuild.rs
Normal file
@@ -0,0 +1,145 @@
|
|||||||
|
//! Rebuild command: trigger a server rebuild and restart.
|
||||||
|
//!
|
||||||
|
//! `{bot_name} rebuild` stops all running agents, rebuilds the server binary
|
||||||
|
//! with `cargo build`, and re-execs the process with the new binary. If the
|
||||||
|
//! build fails the error is reported back to the room and the server keeps
|
||||||
|
//! running.
|
||||||
|
|
||||||
|
use crate::agents::AgentPool;
|
||||||
|
use std::path::Path;
|
||||||
|
use std::sync::Arc;
|
||||||
|
|
||||||
|
/// A parsed rebuild command.
|
||||||
|
#[derive(Debug, PartialEq)]
|
||||||
|
pub struct RebuildCommand;
|
||||||
|
|
||||||
|
/// Parse a rebuild command from a raw message body.
|
||||||
|
///
|
||||||
|
/// Strips the bot mention prefix and checks whether the command word is
|
||||||
|
/// `rebuild`. Returns `None` when the message is not a rebuild command.
|
||||||
|
pub fn extract_rebuild_command(
|
||||||
|
message: &str,
|
||||||
|
bot_name: &str,
|
||||||
|
bot_user_id: &str,
|
||||||
|
) -> Option<RebuildCommand> {
|
||||||
|
let stripped = strip_mention(message, bot_name, bot_user_id);
|
||||||
|
let trimmed = stripped
|
||||||
|
.trim()
|
||||||
|
.trim_start_matches(|c: char| !c.is_alphanumeric());
|
||||||
|
|
||||||
|
let cmd = match trimmed.split_once(char::is_whitespace) {
|
||||||
|
Some((c, _)) => c,
|
||||||
|
None => trimmed,
|
||||||
|
};
|
||||||
|
|
||||||
|
if cmd.eq_ignore_ascii_case("rebuild") {
|
||||||
|
Some(RebuildCommand)
|
||||||
|
} else {
|
||||||
|
None
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Handle a rebuild command: trigger server rebuild and restart.
|
||||||
|
///
|
||||||
|
/// Returns a string describing the outcome. On build failure the error
|
||||||
|
/// message is returned so it can be posted to the room; the server keeps
|
||||||
|
/// running. On success this function never returns (the process re-execs).
|
||||||
|
pub async fn handle_rebuild(
|
||||||
|
bot_name: &str,
|
||||||
|
project_root: &Path,
|
||||||
|
agents: &Arc<AgentPool>,
|
||||||
|
) -> String {
|
||||||
|
crate::slog!("[matrix-bot] rebuild command received (bot={bot_name})");
|
||||||
|
match crate::rebuild::rebuild_and_restart(agents, project_root).await {
|
||||||
|
Ok(msg) => msg,
|
||||||
|
Err(e) => format!("Rebuild failed: {e}"),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Strip the bot mention prefix from a raw Matrix message body.
fn strip_mention<'a>(message: &'a str, bot_name: &str, bot_user_id: &str) -> &'a str {
    let trimmed = message.trim();
    if let Some(rest) = strip_prefix_ci(trimmed, bot_user_id) {
        return rest;
    }
    if let Some(localpart) = bot_user_id.split(':').next()
        && let Some(rest) = strip_prefix_ci(trimmed, localpart)
    {
        return rest;
    }
    if let Some(rest) = strip_prefix_ci(trimmed, bot_name) {
        return rest;
    }
    trimmed
}

fn strip_prefix_ci<'a>(text: &'a str, prefix: &str) -> Option<&'a str> {
    if text.len() < prefix.len() {
        return None;
    }
    if !text[..prefix.len()].eq_ignore_ascii_case(prefix) {
        return None;
    }
    let rest = &text[prefix.len()..];
    match rest.chars().next() {
        None => Some(rest),
        Some(c) if c.is_alphanumeric() || c == '-' || c == '_' => None,
        _ => Some(rest),
    }
}

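The word-boundary guard in `strip_prefix_ci` is the subtle part: a following name character means the mention matched inside a longer word. A standalone sketch of the same logic (using `str::get` instead of byte slicing — a small hardening assumption, since `get` returns `None` rather than panicking on a multi-byte boundary):

```rust
// Standalone sketch of the ASCII case-insensitive prefix strip above.
fn strip_prefix_ci<'a>(text: &'a str, prefix: &str) -> Option<&'a str> {
    // `get` rejects both too-short inputs and a split inside a
    // multi-byte character, instead of panicking.
    let head = text.get(..prefix.len())?;
    if !head.eq_ignore_ascii_case(prefix) {
        return None;
    }
    let rest = &text[prefix.len()..];
    // Boundary guard: "@timmy" must not strip from "@timmys".
    match rest.chars().next() {
        Some(c) if c.is_alphanumeric() || c == '-' || c == '_' => None,
        _ => Some(rest),
    }
}
```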
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn extract_with_display_name() {
        let cmd = extract_rebuild_command("Timmy rebuild", "Timmy", "@timmy:home.local");
        assert_eq!(cmd, Some(RebuildCommand));
    }

    #[test]
    fn extract_with_full_user_id() {
        let cmd = extract_rebuild_command(
            "@timmy:home.local rebuild",
            "Timmy",
            "@timmy:home.local",
        );
        assert_eq!(cmd, Some(RebuildCommand));
    }

    #[test]
    fn extract_with_localpart() {
        let cmd = extract_rebuild_command("@timmy rebuild", "Timmy", "@timmy:home.local");
        assert_eq!(cmd, Some(RebuildCommand));
    }

    #[test]
    fn extract_case_insensitive() {
        let cmd = extract_rebuild_command("Timmy REBUILD", "Timmy", "@timmy:home.local");
        assert_eq!(cmd, Some(RebuildCommand));
    }

    #[test]
    fn extract_non_rebuild_returns_none() {
        let cmd = extract_rebuild_command("Timmy help", "Timmy", "@timmy:home.local");
        assert_eq!(cmd, None);
    }

    #[test]
    fn extract_ignores_extra_args() {
        // "rebuild" with trailing text is still a rebuild command.
        let cmd = extract_rebuild_command("Timmy rebuild now", "Timmy", "@timmy:home.local");
        assert_eq!(cmd, Some(RebuildCommand));
    }

    #[test]
    fn extract_no_match_returns_none() {
        let cmd = extract_rebuild_command("Timmy status", "Timmy", "@timmy:home.local");
        assert_eq!(cmd, None);
    }
}
@@ -165,6 +165,12 @@ pub async fn handle_start(
                info.agent_name
            )
        }
        Err(e) if e.contains("All coder agents are busy") => {
            format!(
                "**{story_name}** has been queued in `work/2_current/` and will start \
                 automatically when a coder becomes available."
            )
        }
        Err(e) => {
            format!("Failed to start **{story_name}**: {e}")
        }
@@ -312,6 +318,42 @@ mod tests {
        );
    }

    #[tokio::test]
    async fn handle_start_says_queued_not_error_when_all_coders_busy() {
        use crate::agents::{AgentPool, AgentStatus};
        use std::sync::Arc;

        let tmp = tempfile::tempdir().unwrap();
        let project_root = tmp.path();
        let sk = project_root.join(".storkit");
        let backlog = sk.join("work/1_backlog");
        std::fs::create_dir_all(&backlog).unwrap();
        std::fs::write(
            sk.join("project.toml"),
            "[[agent]]\nname = \"coder-1\"\nstage = \"coder\"\n",
        )
        .unwrap();
        std::fs::write(
            backlog.join("356_story_test.md"),
            "---\nname: Test Story\n---\n",
        )
        .unwrap();

        let agents = Arc::new(AgentPool::new_test(3000));
        agents.inject_test_agent("other-story", "coder-1", AgentStatus::Running);

        let response = handle_start("Timmy", "356", None, project_root, &agents).await;

        assert!(
            !response.contains("Failed"),
            "response must not say 'Failed' when coders are busy: {response}"
        );
        assert!(
            response.to_lowercase().contains("queue")
                || response.to_lowercase().contains("available"),
            "response must mention queued/available state: {response}"
        );
    }

    #[test]
    fn start_command_is_registered() {
        use crate::matrix::commands::commands;
server/src/rebuild.rs (new file, 98 lines)
@@ -0,0 +1,98 @@
//! Server rebuild and restart logic shared between the MCP tool and Matrix bot command.

use crate::agents::AgentPool;
use crate::slog;
use std::path::Path;

/// Rebuild the server binary and re-exec.
///
/// 1. Gracefully stops all running agents (kills PTY children).
/// 2. Runs `cargo build [-p storkit]` from the workspace root, matching
///    the current build profile (debug or release).
/// 3. If the build fails, returns the build error (server stays up).
/// 4. If the build succeeds, re-execs the process with the new binary via
///    `std::os::unix::process::CommandExt::exec()`.
pub async fn rebuild_and_restart(agents: &AgentPool, project_root: &Path) -> Result<String, String> {
    slog!("[rebuild] Rebuild and restart requested");

    // 1. Gracefully stop all running agents.
    let running_count = agents
        .list_agents()
        .unwrap_or_default()
        .iter()
        .filter(|a| a.status == crate::agents::AgentStatus::Running)
        .count();
    if running_count > 0 {
        slog!("[rebuild] Stopping {running_count} running agent(s) before rebuild");
    }
    agents.kill_all_children();

    // 2. Find the workspace root (parent of the server binary's source).
    //    CARGO_MANIFEST_DIR at compile time points to the `server/` crate;
    //    the workspace root is its parent.
    let manifest_dir = std::path::Path::new(env!("CARGO_MANIFEST_DIR"));
    let workspace_root = manifest_dir
        .parent()
        .ok_or_else(|| "Cannot determine workspace root from CARGO_MANIFEST_DIR".to_string())?;

    slog!(
        "[rebuild] Building server from workspace root: {}",
        workspace_root.display()
    );

    // 3. Build the server binary, matching the current build profile so the
    //    re-exec via current_exe() picks up the new binary.
    let build_args: Vec<&str> = if cfg!(debug_assertions) {
        vec!["build", "-p", "storkit"]
    } else {
        vec!["build", "--release", "-p", "storkit"]
    };
    slog!("[rebuild] cargo {}", build_args.join(" "));
    let output = tokio::task::spawn_blocking({
        let workspace_root = workspace_root.to_path_buf();
        move || {
            std::process::Command::new("cargo")
                .args(&build_args)
                .current_dir(&workspace_root)
                .output()
        }
    })
    .await
    .map_err(|e| format!("Build task panicked: {e}"))?
    .map_err(|e| format!("Failed to run cargo build: {e}"))?;

    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        slog!("[rebuild] Build failed:\n{stderr}");
        return Err(format!("Build failed:\n{stderr}"));
    }

    slog!("[rebuild] Build succeeded, re-execing with new binary");

    // 4. Re-exec with the new binary. Collect current argv so we preserve
    //    any CLI arguments (e.g. project path).
    let current_exe =
        std::env::current_exe().map_err(|e| format!("Cannot determine current executable: {e}"))?;
    let args: Vec<String> = std::env::args().collect();

    // Remove the port file before re-exec so the new process can write its own.
    let port_file = project_root.join(".storkit_port");
    if port_file.exists() {
        let _ = std::fs::remove_file(&port_file);
    }
    // Also check cwd for a port file.
    let cwd_port_file = std::path::Path::new(".storkit_port");
    if cwd_port_file.exists() {
        let _ = std::fs::remove_file(cwd_port_file);
    }

    // Use exec() to replace the current process. This never returns on success.
    use std::os::unix::process::CommandExt;
    let err = std::process::Command::new(&current_exe)
        .args(&args[1..])
        .exec();

    // If we get here, exec() failed.
    Err(format!("Failed to exec new binary: {err}"))
}
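The profile-matching argument selection in step 3 can be isolated for clarity; this is a minimal sketch, not the crate's API, with the flag standing in for `cfg!(debug_assertions)`:

```rust
// Sketch of the profile-matching build-arg selection above. The rebuilt
// binary must land at the path current_exe() points to, so a debug
// server rebuilds in debug and a release server rebuilds in release.
fn build_args(release_profile: bool) -> Vec<&'static str> {
    if release_profile {
        vec!["build", "--release", "-p", "storkit"]
    } else {
        vec!["build", "-p", "storkit"]
    }
}
```

If the profiles were mismatched, `cargo` would write the new binary into the other target directory and the `exec()` of `current_exe()` would silently relaunch the old build.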