storkit: create 452_bug_claude_code_pty_crashes_with_fatal_runtime_error_on_agent_restart

dave
2026-03-31 11:30:44 +00:00
parent 19cc684433
commit 2b0b08ceda
@@ -19,19 +19,18 @@ The fix should include:
## How to Reproduce
Run several agent sessions (or trigger gate failures that cause restarts). After enough sessions, zombie `[claude] <defunct>` processes accumulate. Check with `ps aux | grep '<defunct>'`. Once enough zombies exist, new agent spawns crash immediately with the fatal runtime error.
Run several agent sessions, then count zombies per command with `ps -eo stat,comm | awk '$1 ~ /^Z/ {print $2}' | sort | uniq -c | sort -rn`.
## Actual Result
`fatal runtime error: assertion failed: output.write(&bytes).is_ok(), aborting` — process exits instantly, session is None, burns retries. `ps aux` shows many `[claude] <defunct>` zombie processes.
Zombie processes accumulate continuously and are never reaped.
## Expected Result
All child processes should be properly reaped after exit. No zombie accumulation. New agent spawns should always succeed regardless of how many previous sessions have run.
No zombie accumulation during normal operation.
## Acceptance Criteria
- [ ] Zombie processes do not accumulate during normal operation (including grandchildren from npm/cargo)
- [ ] `child.wait()` is called after `child.kill()` in all code paths in `claude_code.rs`
- [ ] Agent restart after gate failure does not crash the PTY
- [ ] Verified with `ps aux | grep '<defunct>'` after running multiple agent sessions
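
The kill-then-wait criterion above can be sketched as follows. This is a minimal illustration, assuming the child is a `std::process::Child`; the actual handle type used in `claude_code.rs` may differ (for example a PTY library's child type), and `kill_and_reap` is a hypothetical helper name, not a function from the codebase.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};

/// Hypothetical helper: kill a child and then reap it so that it does not
/// linger in the process table as a zombie.
fn kill_and_reap(child: &mut Child) -> io::Result<ExitStatus> {
    // The result of kill() is ignored on purpose: even if the signal could
    // not be delivered, wait() is still needed to collect the exit status.
    let _ = child.kill();
    // wait() blocks until the child has exited and reaps it.
    child.wait()
}

fn main() -> io::Result<()> {
    // `sleep 30` stands in for a long-running agent process (assumption).
    let mut child = Command::new("sleep").arg("30").spawn()?;
    let status = kill_and_reap(&mut child)?;
    println!("child reaped, success = {}", status.success());
    Ok(())
}
```

Note that `wait()` only reaps the direct child. Grandchildren spawned by tools such as npm or cargo are not covered by this sketch and would need separate handling.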