From d74a5eb9f6bee6de1606bdef851db7304fe5da92 Mon Sep 17 00:00:00 2001
From: Dave <futurechimp@users.noreply.github.com>
Date: Tue, 24 Feb 2026 16:21:51 +0000
Subject: [PATCH] story-kit: start
 157_story_make_start_agent_non_blocking_by_deferring_worktree_creation

---
 ...blocking_by_deferring_worktree_creation.md | 55 +++++++++++++++++++
 1 file changed, 55 insertions(+)
 create mode 100644 .story_kit/work/2_current/157_story_make_start_agent_non_blocking_by_deferring_worktree_creation.md

diff --git a/.story_kit/work/2_current/157_story_make_start_agent_non_blocking_by_deferring_worktree_creation.md b/.story_kit/work/2_current/157_story_make_start_agent_non_blocking_by_deferring_worktree_creation.md
new file mode 100644
index 0000000..eb401e9
--- /dev/null
+++ b/.story_kit/work/2_current/157_story_make_start_agent_non_blocking_by_deferring_worktree_creation.md
@@ -0,0 +1,55 @@
+---
+name: "Make start_agent non-blocking by deferring worktree creation"
+---
+
+# Story 157: Make start_agent non-blocking by deferring worktree creation
+
+## Description
+
+`start_agent()` in `server/src/agents.rs` currently blocks on worktree creation (line 380: `worktree::create_worktree()`) before returning. This means the MCP `start_agent` tool call takes 10-30 seconds to respond, during which the web UI chat agent is frozen waiting for the result. The user experiences this as the chat being unresponsive when they ask it to start a coder on something.
+
+## Current Flow (blocking)
+
+1. Register agent as Pending in HashMap (fast)
+2. `move_story_to_current()` (fast — file move + git commit)
+3. **`worktree::create_worktree()` (SLOW — git checkout, mkdir, possibly pnpm install)**
+4. Update agent with worktree info
+5. `tokio::spawn` the agent process (fire-and-forget)
+6. Return result to caller
+
+## Desired Flow (non-blocking)
+
+1. Register agent as Pending in HashMap (fast)
+2. `move_story_to_current()` (fast)
+3. Return immediately with `{"status": "pending", ...}`
+4. Inside the existing `tokio::spawn` (line 416), which must be started before `start_agent()` returns, create the worktree first, then launch the agent process
+
+## Key Changes
+
+In `server/src/agents.rs` `start_agent()` (line 260):
+
+1. Move the worktree creation block (lines 379-388) and the agent config/prompt rendering (lines 391-398) into the `tokio::spawn` block (line 416), before `run_agent_pty_streaming`
+2. The spawn already transitions status to "running" — add worktree creation before that transition
+3. If worktree creation fails inside the spawn, emit an Error event and set status to Failed (the `PendingGuard` pattern may need adjustment since it currently lives outside the spawn)
+4. Return from `start_agent()` right after `move_story_to_current()` (step 2 of the desired flow) with the Pending status and no worktree info yet
+
+## Error Handling
+
+The `PendingGuard` (line 368) currently cleans up the HashMap entry if `start_agent` fails before reaching the spawn. With the new flow, the guard logic needs to move inside the spawn since that's where failures can now happen (worktree creation, config rendering). If worktree creation fails in the spawn, it should:
+- Send an `AgentEvent::Error` so the UI knows
+- Set status to Failed in the HashMap
+- NOT leave a stale Pending entry
+
+## Key Files
+
+- `server/src/agents.rs` line 260: `start_agent()` — main function to restructure
+- `server/src/agents.rs` line 380: `worktree::create_worktree()` — the blocking call to move into spawn
+- `server/src/agents.rs` line 416: existing `tokio::spawn` block — expand to include worktree creation
+
+## Acceptance Criteria
+
+- [ ] `start_agent` MCP tool returns within 1-2 seconds (no waiting for worktree)
+- [ ] Agent transitions Pending → Running after worktree is created in background
+- [ ] If worktree creation fails, agent status becomes Failed with an error message
+- [ ] No stale Pending entries left in HashMap on failure
+- [ ] Existing agent functionality unchanged (worktree created, agent runs, events stream)