story-kit: create 280_story_long_running_supervisor_agent_with_periodic_pipeline_polling

2026-03-18 13:51:32 +00:00
parent bc5a3da2c0
commit 945648bf6e
2 changed files with 32 additions and 22 deletions
--- a/.story_kit/work/1_upcoming/280_story_long_running_supervisor_agent_with_periodic_pipeline_polling.md
+++ b/.story_kit/work/1_upcoming/280_story_long_running_supervisor_agent_with_periodic_pipeline_polling.md
@@ -0,0 +1,32 @@
 ---
 name: "Long-running supervisor agent with periodic pipeline polling"
 agent: coder-opus
 ---
 # Story 280: Long-running supervisor agent with periodic pipeline polling
 ## User Story
 As a project owner, I want a long-running supervisor agent (opus) that automatically monitors the pipeline, assigns agents, resolves stuck items, and handles routine operational tasks, so that I don't have to manually check status, kick agents, or babysit the pipeline in every conversation.
 ## Acceptance Criteria
 - [ ] Server can start a persistent supervisor agent that stays alive across the session (not per-story)
 - [ ] Server prods the supervisor periodically (every 30s by default) with a pipeline status update
 - [ ] Supervisor auto-assigns agents to unassigned items in current/qa/merge stages
 - [ ] Supervisor detects stuck agents (no progress for a configurable timeout) and restarts them
 - [ ] Supervisor detects merge failures and sends stories back to current for rebase when appropriate
 - [ ] Supervisor can be chatted with via Matrix (timmy relays to supervisor) or via the web UI
 - [ ] Supervisor logs its decisions so the human can review what it did and why
 - [ ] Polling interval is configurable in `project.toml` (e.g. `supervisor_poll_interval_secs = 30`)
 - [ ] Supervisor logs persistent/recurring problems to `.story_kit/problems.md` with timestamp, description, and frequency — humans review this file periodically to create stories for systemic issues
 ## Notes
 - **2026-03-18**: Moved back to current from merge. Previous attempt went through the full pipeline but the squash-merge produced an empty diff — no code was actually implemented. Needs a real implementation.
 ## Out of Scope
 - Supervisor accepting or merging stories to master (human job)
 - Supervisor making architectural decisions
 - Replacing the existing per-story agent spawning — supervisor coordinates on top of it
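Reviewer sketch for the stuck-agent criterion in Story 280: one way the "no progress for a configurable timeout" check could look. The `AgentStatus` struct, its fields, and the `stuck_agents` helper are hypothetical names chosen for illustration, and the 300-second timeout is an assumed value, not something the story specifies.

```rust
use std::time::{Duration, Instant};

/// Hypothetical snapshot of one running agent (illustrative fields only).
struct AgentStatus {
    story_id: u32,
    /// When the agent last reported progress.
    last_progress: Instant,
}

/// Returns the story ids of agents that have made no progress for at
/// least `timeout`, measured at the moment `now`.
fn stuck_agents(agents: &[AgentStatus], now: Instant, timeout: Duration) -> Vec<u32> {
    agents
        .iter()
        // saturating_duration_since yields zero if last_progress is after now.
        .filter(|a| now.saturating_duration_since(a.last_progress) >= timeout)
        .map(|a| a.story_id)
        .collect()
}

fn main() {
    let start = Instant::now();
    // Pretend ten minutes have passed since `start`.
    let now = start + Duration::from_secs(600);
    let agents = vec![
        // Silent for the full ten minutes.
        AgentStatus { story_id: 280, last_progress: start },
        // Reported progress ten seconds ago.
        AgentStatus { story_id: 286, last_progress: start + Duration::from_secs(590) },
    ];
    // Assumed timeout of 300 seconds; the story leaves the value open.
    let stuck = stuck_agents(&agents, now, Duration::from_secs(300));
    println!("stuck: {:?}", stuck);
}
```

Passing `now` in as a parameter rather than calling `Instant::now()` inside the helper keeps the check deterministic, so each periodic prod can evaluate every agent against the same instant.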
--- a/.story_kit/work/1_upcoming/286_story_server_self_rebuild_and_restart_via_mcp_tool.md
+++ b/.story_kit/work/1_upcoming/286_story_server_self_rebuild_and_restart_via_mcp_tool.md
@@ -1,22 +0,0 @@
 ---
 name: "Server self-rebuild and restart via MCP tool"
 ---
 # Story 286: Server self-rebuild and restart via MCP tool
 ## User Story
 As a project owner away from my terminal, I want to tell the bot to restart the server so that it picks up new code changes, without needing physical access to the machine.
 ## Acceptance Criteria
 - [ ] MCP tool `rebuild_and_restart` triggers a cargo build of the server
 - [ ] If the build fails, server stays up and returns the build error
 - [ ] If the build succeeds, server re-execs itself with the new binary using `std::os::unix::process::CommandExt::exec()`
 - [ ] Server logs the restart so it's traceable
 - [ ] Matrix bot reconnects automatically after the server comes back up
 - [ ] Running agents are gracefully stopped before re-exec
 ## Out of Scope
 - TBD
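Reviewer sketch for the build-then-re-exec criteria in Story 286, using only the standard library. The function name mirrors the proposed MCP tool, but its signature, the error strings, and the stand-in build command in `main` are assumptions made for illustration. The real tool would presumably run a cargo build and re-exec the server's own binary after stopping running agents.

```rust
use std::os::unix::process::CommandExt;
use std::process::Command;

/// Runs `build` and, if it succeeds, replaces the current process with `next`.
/// This function only returns on failure, and the returned string is the reason.
fn rebuild_and_restart(mut build: Command, mut next: Command) -> String {
    // Run the build to completion and capture its output.
    let output = match build.output() {
        Ok(out) => out,
        Err(e) => return format!("could not start build: {e}"),
    };
    if !output.status.success() {
        // Build failed: the caller keeps running and can report the error.
        return String::from_utf8_lossy(&output.stderr).into_owned();
    }
    // (Assumed hook point: stop running agents and log the restart here.)
    // exec() replaces the process image and returns only if it failed.
    let err = next.exec();
    format!("re-exec failed: {err}")
}

fn main() {
    // Stand-in for the real build step: a shell command that always fails,
    // so this demo exercises the "server stays up" path and never re-execs.
    let mut build = Command::new("sh");
    build.arg("-c").arg("echo 'error: build failed' >&2; exit 1");

    // Stand-in for the new binary; it is not reached in this demo.
    let next = Command::new("true");

    let reason = rebuild_and_restart(build, next);
    eprintln!("restart aborted: {}", reason.trim());
}
```

Because `exec()` never returns on success, anything that must happen before the restart (logging it, stopping agents gracefully) has to run before that call.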