diff --git a/.story_kit/work/1_upcoming/280_story_long_running_supervisor_agent_with_periodic_pipeline_polling.md b/.story_kit/work/1_upcoming/280_story_long_running_supervisor_agent_with_periodic_pipeline_polling.md
new file mode 100644
index 0000000..e9e9259
--- /dev/null
+++ b/.story_kit/work/1_upcoming/280_story_long_running_supervisor_agent_with_periodic_pipeline_polling.md
@@ -0,0 +1,32 @@
+---
+name: "Long-running supervisor agent with periodic pipeline polling"
+agent: coder-opus
+---
+
+# Story 280: Long-running supervisor agent with periodic pipeline polling
+
+## User Story
+
+As a project owner, I want a long-running supervisor agent (opus) that automatically monitors the pipeline, assigns agents, resolves stuck items, and handles routine operational tasks, so that I don't have to manually check status, kick agents, or babysit the pipeline in every conversation.
+
+## Acceptance Criteria
+
+- [ ] Server can start a persistent supervisor agent that stays alive across the session (not per-story)
+- [ ] Server prods the supervisor periodically (default 30s, configurable in project.toml) with a pipeline status update
+- [ ] Supervisor auto-assigns agents to unassigned items in current/qa/merge stages
+- [ ] Supervisor detects stuck agents (no progress for configurable timeout) and restarts them
+- [ ] Supervisor detects merge failures and sends stories back to current for rebase when appropriate
+- [ ] Supervisor can be chatted with via Matrix (timmy relays to supervisor) or via the web UI
+- [ ] Supervisor logs its decisions so the human can review what it did and why
+- [ ] Polling interval is configurable in project.toml (e.g. supervisor_poll_interval_secs = 30)
+- [ ] Supervisor logs persistent/recurring problems to `.story_kit/problems.md` with timestamp, description, and frequency — humans review this file periodically to create stories for systemic issues
+
+## Notes
+
+- **2026-03-18**: Moved back to current from merge. Previous attempt went through the full pipeline but the squash-merge produced an empty diff — no code was actually implemented. Needs a real implementation.
+
+## Out of Scope
+
+- Supervisor accepting or merging stories to master (human job)
+- Supervisor making architectural decisions
+- Replacing the existing per-story agent spawning — supervisor coordinates on top of it
diff --git a/.story_kit/work/1_upcoming/286_story_server_self_rebuild_and_restart_via_mcp_tool.md b/.story_kit/work/1_upcoming/286_story_server_self_rebuild_and_restart_via_mcp_tool.md
deleted file mode 100644
index 78ef58b..0000000
--- a/.story_kit/work/1_upcoming/286_story_server_self_rebuild_and_restart_via_mcp_tool.md
+++ /dev/null
@@ -1,22 +0,0 @@
----
-name: "Server self-rebuild and restart via MCP tool"
----
-
-# Story 286: Server self-rebuild and restart via MCP tool
-
-## User Story
-
-As a project owner away from my terminal, I want to tell the bot to restart the server so that it picks up new code changes, without needing physical access to the machine.
-
-## Acceptance Criteria
-
-- [ ] MCP tool `rebuild_and_restart` triggers a cargo build of the server
-- [ ] If the build fails, server stays up and returns the build error
-- [ ] If the build succeeds, server re-execs itself with the new binary using std::os::unix::process::CommandExt::exec()
-- [ ] Server logs the restart so it's traceable
-- [ ] Matrix bot reconnects automatically after the server comes back up
-- [ ] Running agents are gracefully stopped before re-exec
-
-## Out of Scope
-
-- TBD