story-kit: queue 134_story_add_process_health_monitoring_and_timeout_to_agent_pty_sessions for QA

2026-02-24 13:09:05 +00:00
parent 997050aee1
commit 92a75215f0
1 changed files with 0 additions and 0 deletions
--- a/.story_kit/work/2_current/134_story_add_process_health_monitoring_and_timeout_to_agent_pty_sessions.md
+++ b/.story_kit/work/2_current/134_story_add_process_health_monitoring_and_timeout_to_agent_pty_sessions.md
@@ -1,21 +0,0 @@
----
-name: "Add process health monitoring and timeout to agent PTY sessions"
----
-
-# Story 134: Add process health monitoring and timeout to agent PTY sessions
-
-## User Story
-
-As a user, I want hung or unresponsive agent processes to be detected and cleaned up automatically so that the system recovers without manual intervention.
-
-## Acceptance Criteria
-
-- [ ] The PTY read loop has a configurable inactivity timeout (default 5 minutes); if no output is received within the timeout, the process is killed and the agent status is set to Failed
-- [ ] A background watchdog task periodically checks that Running agents still have a live process, and marks orphaned entries as Failed
-- [ ] When an agent process is killed externally (e.g. SIGKILL), the agent status transitions to Failed within the timeout period rather than hanging indefinitely
-- [ ] A test demonstrates that a hung agent (no PTY output) is killed and marked Failed after the timeout
-- [ ] A test demonstrates that an externally killed agent is detected and cleaned up by the watchdog
-
-## Out of Scope
-
-- TBD
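
The acceptance criteria in the removed story describe two mechanisms: an inactivity timeout on the read loop, and a watchdog that marks Running agents without a live process as Failed. The following is a minimal Python sketch of that behaviour, not the project's implementation, which this commit does not show. The names `run_with_inactivity_timeout` and `sweep_orphans`, the agent dictionary layout, and the use of a plain pipe instead of a real PTY are all assumptions made for illustration.

```python
import os
import select
import subprocess
import sys


def run_with_inactivity_timeout(cmd, timeout_s=300.0):
    """Run cmd and kill it if it emits no output for timeout_s seconds.

    Hypothetical helper. Returns (status, output), where status is
    "Completed" or "Failed". Uses select() on a pipe, so it is Unix-only.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    chunks = []
    try:
        while True:
            # Wait until the child writes something or the timeout elapses.
            ready, _, _ = select.select([fd], [], [], timeout_s)
            if not ready:
                # No output within the window: treat the process as hung.
                proc.kill()
                proc.wait()
                return "Failed", b"".join(chunks)
            data = os.read(fd, 4096)
            if not data:
                # EOF: the child exited or was killed externally.
                break
            chunks.append(data)
        exit_code = proc.wait()
        status = "Completed" if exit_code == 0 else "Failed"
        return status, b"".join(chunks)
    finally:
        proc.stdout.close()


def sweep_orphans(agents):
    """One watchdog pass: mark Running agents whose process has exited.

    `agents` is an assumed mapping of id -> {"status": str, "proc": Popen}.
    """
    for agent in agents.values():
        if agent["status"] == "Running" and agent["proc"].poll() is not None:
            agent["status"] = "Failed"


if __name__ == "__main__":
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    status, _ = run_with_inactivity_timeout(hung, timeout_s=0.5)
    print(status)
```

In this sketch an externally killed child closes its end of the pipe, so the read loop sees EOF and reports Failed from the non-zero exit code. A background task would call `sweep_orphans` on a fixed interval to catch entries whose read loop is no longer running.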