story-kit: queue 134_story_add_process_health_monitoring_and_timeout_to_agent_pty_sessions for QA
@@ -1,21 +0,0 @@
---
name: "Add process health monitoring and timeout to agent PTY sessions"
---
# Story 134: Add process health monitoring and timeout to agent PTY sessions
## User Story
As a user, I want hung or unresponsive agent processes to be detected and cleaned up automatically so that the system recovers without manual intervention.
## Acceptance Criteria
- [ ] The PTY read loop has a configurable inactivity timeout (default 5 minutes). If no output is received within the timeout, the process is killed and the agent status is set to Failed
- [ ] A background watchdog task periodically checks that Running agents still have a live process, and marks orphaned entries as Failed
- [ ] When an agent process is killed externally (e.g. SIGKILL), the agent status transitions to Failed within the timeout period rather than hanging indefinitely
- [ ] A test demonstrates that a hung agent (no PTY output) is killed and marked Failed after the timeout
- [ ] A test demonstrates that an externally killed agent is detected and cleaned up by the watchdog
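
The criteria above could be prototyped roughly as follows. This is only a sketch under stated assumptions, not the project's implementation: it uses a plain pipe instead of a real PTY, relies on POSIX `select`, and every name in it (`AgentStatus`, `run_with_inactivity_timeout`, `sweep_orphans`, the `agents` dictionary layout, and the `COMPLETED` status) is hypothetical. Only the Running and Failed statuses come from the story itself.

```python
import enum
import os
import select
import subprocess
import sys


class AgentStatus(enum.Enum):
    # RUNNING and FAILED mirror the story; COMPLETED is an assumed extra state.
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


def run_with_inactivity_timeout(argv, inactivity_timeout):
    """Run argv and kill it if it stays silent for inactivity_timeout seconds.

    Returns (status, captured_output). Hypothetical helper for illustration.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    chunks = []
    try:
        while True:
            # The timer restarts on every iteration, so it measures silence
            # since the last output, not total run time.
            ready, _, _ = select.select([fd], [], [], inactivity_timeout)
            if not ready:
                proc.kill()
                return AgentStatus.FAILED, b"".join(chunks)
            data = os.read(fd, 4096)
            if not data:
                # EOF: the process exited or was killed externally.
                break
            chunks.append(data)
        code = proc.wait()
        status = AgentStatus.COMPLETED if code == 0 else AgentStatus.FAILED
        return status, b"".join(chunks)
    finally:
        if proc.poll() is None:
            proc.kill()
        proc.wait()
        proc.stdout.close()


def sweep_orphans(agents):
    """One watchdog pass: mark Running entries whose process has exited as Failed.

    agents maps a name to {"status": AgentStatus, "proc": subprocess.Popen};
    this layout is an assumption made for the sketch.
    """
    for entry in agents.values():
        if entry["status"] is AgentStatus.RUNNING and entry["proc"].poll() is not None:
            entry["status"] = AgentStatus.FAILED


if __name__ == "__main__":
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    status, _ = run_with_inactivity_timeout(hung, inactivity_timeout=0.5)
    print(status)
```

A real watchdog would call something like `sweep_orphans` on a timer from a background task. The two tests in the criteria map onto these pieces: a silent child exercises the timeout path, and a child killed from outside exercises the sweep.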
## Out of Scope
- TBD