story-kit: queue 159_bug_server_restart_leaves_orphaned_claude_code_pty_processes_running for QA

Dave
2026-02-24 17:53:11 +00:00
parent ee8be90ce5
commit c3a4f0858a


@@ -1,44 +0,0 @@
---
name: "Server restart leaves orphaned Claude Code PTY processes running"
---
# Bug 159: Server restart leaves orphaned Claude Code PTY processes running
## Description
When the server is restarted, existing Claude Code PTY child processes are not killed. They continue running as orphans. The new server instance then starts fresh agents on the same worktrees, causing conflicts — two Claude Code processes fighting over the same worktree, session locks, and files.
## How to Reproduce
1. Start the server with agents running (e.g. a coder on a story)
2. Restart the server (Ctrl+C, then start again)
3. The old Claude Code processes are still alive (check with `ps aux | grep claude`)
4. The new server starts new agent processes on the same stories/worktrees
5. Two Claude Code processes are now running in the same worktree
## Observed Symptoms
- `session_id: null` for minutes after restart (new process can't initialize, possibly because old process holds locks)
- Two Claude Code processes with different PIDs visible for the same story (old orphan + new process)
- Agent may appear stuck or produce garbled output
## Expected Behavior
On server shutdown (or before spawning a new agent on the same worktree), all child PTY processes should be terminated. Two options:
1. **Graceful shutdown**: On SIGTERM/SIGINT, iterate over all running agents and kill their PTY child processes before exiting
2. **Startup reconciliation**: On startup, detect and kill any orphaned Claude Code processes running in `.story_kit/worktrees/` before starting new agents
Option 1 is the cleaner approach. Option 2 is a safety net.
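
A minimal sketch of Option 1, under stated assumptions: `ChildRegistry`, `register`, `kill_all`, and `tracked` are hypothetical names, and `std::process::Child` stands in for whatever PTY child handle `run_agent_pty_streaming` actually holds. The real fix would call something like `kill_all` from the SIGTERM/SIGINT handler in `server/src/main.rs`; the signal wiring is left out here.

```rust
use std::collections::HashMap;
use std::process::{Child, Command};

/// Hypothetical registry of agent child processes, keyed by story id.
/// `std::process::Child` is a stand-in for the real PTY child handle.
#[derive(Default)]
pub struct ChildRegistry {
    children: HashMap<String, Child>,
}

impl ChildRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Track a child for a story. If a child is already registered for the
    /// same story, kill and reap it first so two never share a worktree.
    pub fn register(&mut self, story_id: &str, child: Child) {
        if let Some(mut old) = self.children.insert(story_id.to_string(), child) {
            let _ = old.kill();
            let _ = old.wait();
        }
    }

    /// Kill and reap every tracked child. Returns how many were reaped.
    pub fn kill_all(&mut self) -> usize {
        let mut stopped = 0;
        for (_, mut child) in self.children.drain() {
            // Ignore kill errors: the child may already have exited.
            let _ = child.kill();
            // wait() reaps the process so it does not linger as a zombie.
            if child.wait().is_ok() {
                stopped += 1;
            }
        }
        stopped
    }

    /// Number of children currently tracked.
    pub fn tracked(&self) -> usize {
        self.children.len()
    }
}

fn main() -> std::io::Result<()> {
    let mut registry = ChildRegistry::new();
    // `sleep 60` is a placeholder for a long-running agent process (Unix only).
    let child = Command::new("sleep").arg("60").spawn()?;
    registry.register("159_bug", child);

    // In the server this would run on SIGTERM/SIGINT, before exiting.
    let stopped = registry.kill_all();
    println!("stopped {stopped} child process(es)");
    Ok(())
}
```

Calling `wait()` after `kill()` matters: without it the killed child stays in the process table until the server itself exits. Note that killing the direct child does not necessarily terminate grandchildren it spawned; handling those (for example via process groups) is outside this sketch.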
## Key Files
- `server/src/agents.rs` — `AgentPool` holds `task_handle` for each agent's spawned tokio task
- `server/src/llm/providers/claude_code.rs` — PTY process spawning (`run_agent_pty_streaming`)
- `server/src/main.rs` — server startup/shutdown
## Acceptance Criteria
- [ ] Server shutdown kills all child PTY processes before exiting
- [ ] No orphaned Claude Code processes remain after server restart
- [ ] New agent processes start cleanly without competing with orphaned processes
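
One possible way to check for leftover processes, which could also serve as the detection half of startup reconciliation: list every process whose working directory is inside the worktrees directory. This is a Linux-only sketch that relies on the `/proc/<pid>/cwd` symlink; `processes_under` is a hypothetical helper, and it does not filter by process name, so it reports any process in the worktree, not only Claude Code.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Return (pid, cwd) for every visible process whose working directory is
/// inside `root`. `root` should be an absolute, symlink-free path, because
/// /proc/<pid>/cwd resolves to one.
pub fn processes_under(root: &Path) -> Vec<(u32, PathBuf)> {
    let mut found = Vec::new();
    let entries = match fs::read_dir("/proc") {
        Ok(entries) => entries,
        Err(_) => return found, // no /proc on this platform: report nothing
    };
    for entry in entries.flatten() {
        // Only numeric directory names under /proc are PIDs.
        let pid = match entry.file_name().to_string_lossy().parse::<u32>() {
            Ok(pid) => pid,
            Err(_) => continue,
        };
        // read_link fails for processes we are not allowed to inspect
        // or that exited in the meantime; skip those.
        if let Ok(cwd) = fs::read_link(entry.path().join("cwd")) {
            if cwd.starts_with(root) {
                found.push((pid, cwd));
            }
        }
    }
    found
}

fn main() {
    // Assumes the server is started from the project root.
    let root = std::env::current_dir()
        .expect("current directory")
        .join(".story_kit/worktrees");
    for (pid, cwd) in processes_under(&root) {
        println!("{pid}\t{}", cwd.display());
    }
}
```

Run after a restart, an empty listing (apart from agents the new server has started) would be consistent with the second criterion. Whether to kill what it finds is a separate decision, since the match is by directory only.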