story-kit: accept 96_story_reset_agent_lozenge_to_idle_state_when_returning_to_roster

This commit is contained in:
Dave
2026-02-23 20:52:06 +00:00
parent 7f18542c09
commit bed46fea1b
13 changed files with 627 additions and 48 deletions


@@ -0,0 +1,42 @@
---
name: "Persistent per-session agent logs"
---
# Story 89: Persistent per-session agent logs
## User Story
As a user, I want each agent session to write its output to a persistent log file so I can inspect what an agent did after it completes, even across server restarts.
## Acceptance Criteria
- [ ] Each agent session writes output to a log file at `.story_kit/logs/{story_id}/{agent_name}-{session_id}.log`
- [ ] Log files persist across server restarts and agent completions
- [ ] The `get_agent_output` MCP tool falls back to reading the log file when the in-memory stream is empty or the agent has completed
- [ ] Log files include timestamps, tool calls, text output, and status events
- [ ] Different sessions for the same agent on the same story produce separate log files (no mixing)
## Test Plan
### Unit Tests (server/src/agent_log.rs)
1. **test_log_writer_creates_directory_and_file** — AC1: Verify `AgentLogWriter::new()` creates `.story_kit/logs/{story_id}/` and the log file `{agent_name}-{session_id}.log`.
2. **test_log_writer_writes_jsonl_with_timestamps** — AC4: Verify `write_event()` writes valid JSONL with ISO 8601 timestamps including type, data, and timestamp fields.
3. **test_read_log_parses_written_events** — AC3/AC4: Verify `read_log()` can round-trip events written by `write_event()`.
4. **test_separate_sessions_produce_separate_files** — AC5: Verify two `AgentLogWriter` instances for the same story_id+agent_name but different session_ids write to different files without mixing.
5. **test_find_latest_log_returns_most_recent** — AC3: Verify `find_latest_log()` returns the correct most-recent log for a given story_id+agent_name pair.
6. **test_log_files_persist_on_disk** — AC2: Verify that after writer is dropped, the file still exists and is readable.
### Unit Tests (server/src/agents.rs)
7. **test_emit_event_writes_to_log_writer** — AC1/AC4: Verify that `emit_event` with a log writer writes to the log file in addition to broadcast+event_log.
### Integration Tests (server/src/http/mcp.rs)
8. **test_get_agent_output_falls_back_to_log_file** — AC3: Verify that when in-memory events are empty and agent is completed, `get_agent_output` reads from the log file.
## Out of Scope
- Log rotation or cleanup of old log files
- Frontend UI for viewing log files
- Log file compression
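
The unit tests above assume roughly this writer shape. A minimal, dependency-free sketch (type and method names are taken from the test plan; the hand-rolled JSON encoding and the epoch-seconds timestamp are simplifications, since the ISO 8601 timestamps the story calls for would need a date crate such as `time`, and `find_latest_log` is omitted):

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{BufRead, BufReader, Write};
use std::path::PathBuf;
use std::time::{SystemTime, UNIX_EPOCH};

/// Sketch of the per-session log writer; field layout is an assumption.
struct AgentLogWriter {
    path: PathBuf,
    file: File,
}

impl AgentLogWriter {
    /// AC1: create `{root}/logs/{story_id}/` and open `{agent_name}-{session_id}.log`.
    fn new(root: &str, story_id: &str, agent_name: &str, session_id: &str) -> std::io::Result<Self> {
        let dir = PathBuf::from(root).join("logs").join(story_id);
        fs::create_dir_all(&dir)?;
        let path = dir.join(format!("{agent_name}-{session_id}.log"));
        let file = OpenOptions::new().create(true).append(true).open(&path)?;
        Ok(Self { path, file })
    }

    /// AC4: append one event as a JSONL line with type, data, and timestamp fields.
    fn write_event(&mut self, kind: &str, data: &str) -> std::io::Result<()> {
        let ts = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
        writeln!(self.file, r#"{{"ts":{ts},"type":"{kind}","data":"{data}"}}"#)
    }

    /// AC3: re-read every line written so far (the fallback path for
    /// `get_agent_output` when the in-memory stream is gone).
    fn read_log(&self) -> std::io::Result<Vec<String>> {
        BufReader::new(File::open(&self.path)?).lines().collect()
    }
}
```

Because the file is opened in append mode and read by path rather than handle, the log survives the writer being dropped, which is what the persistence tests (AC2) exercise.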


@@ -0,0 +1,35 @@
---
name: "Expose server logs to agents via MCP"
---
# Story 93: Expose server logs to agents via MCP
## User Story
As a coder agent, I want to read recent server logs via an MCP tool, so that I can verify runtime behavior (WebSocket events, MCP call flow, PTY interactions) without relying on a human to check.
## Context
Agents currently have no visibility into runtime server behavior. They can run `cargo test` and `cargo clippy`, but for bugs involving runtime flow (e.g. events not reaching the frontend, MCP tools not triggering), they can't verify their fix works end-to-end. Exposing server logs lets agents self-diagnose issues and confirm runtime behavior matches expectations.
## Approach
- Add a bounded ring buffer (e.g. 1000 lines) that captures `eprintln!` / `tracing` output in-process
- Expose an MCP tool `get_server_logs(lines?, filter?)` that returns recent log entries
- `lines`: number of recent lines to return (default 100)
- `filter`: optional substring filter (e.g. `"watcher"`, `"permission"`, `"mcp"`)
- The ring buffer lives in `AppContext` so it's accessible from the MCP handler
## Acceptance Criteria
- [ ] Server captures log output in a bounded in-memory ring buffer
- [ ] `get_server_logs` MCP tool exists with optional `lines` and `filter` parameters
- [ ] Tool returns recent log lines as a newline-separated string
- [ ] Buffer does not grow unbounded (fixed capacity, oldest entries evicted)
- [ ] cargo clippy and cargo test pass
## Out of Scope
- Per-worktree log isolation (agents see the main server's logs)
- Log levels / structured logging migration (can use raw eprintln output for now)
- Log persistence to disk
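
The approach above can be sketched as follows (the struct and method names are hypothetical, not the real `AppContext` API; in the server this would additionally need interior mutability, e.g. a `Mutex`, to be shared with the MCP handler):

```rust
use std::collections::VecDeque;

/// Bounded in-memory log buffer: fixed capacity, oldest entries evicted.
struct LogRingBuffer {
    lines: VecDeque<String>,
    capacity: usize,
}

impl LogRingBuffer {
    fn new(capacity: usize) -> Self {
        Self { lines: VecDeque::with_capacity(capacity), capacity }
    }

    /// Push a captured log line, evicting the oldest once full.
    fn push(&mut self, line: String) {
        if self.lines.len() == self.capacity {
            self.lines.pop_front();
        }
        self.lines.push_back(line);
    }

    /// What `get_server_logs(lines?, filter?)` would return: the last `n`
    /// entries matching the optional substring filter, newline-separated.
    fn tail(&self, n: usize, filter: Option<&str>) -> String {
        let matched: Vec<&str> = self
            .lines
            .iter()
            .filter(|l| filter.map_or(true, |f| l.contains(f)))
            .map(String::as_str)
            .collect();
        let start = matched.len().saturating_sub(n);
        matched[start..].join("\n")
    }
}
```

Filtering before taking the tail means `get_server_logs(lines: 100, filter: "watcher")` returns the last 100 watcher-related lines, not watcher lines within the last 100 entries, which is usually what a diagnosing agent wants.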


@@ -0,0 +1,19 @@
---
name: "Reset agent lozenge to idle state when returning to roster"
---
# Story 96: Reset agent lozenge to idle state when returning to roster
## User Story
As a user, I want agent lozenges to return to their idle appearance (grey with green dot) when they move back to the Agents roster, so that the UI accurately reflects which agents are currently active.
## Acceptance Criteria
- [ ] Agent lozenges in the Agents roster always display in idle state (grey background with green dot)
- [ ] After an agent completes work and returns from current/qa/merge to the roster, its lozenge resets to idle appearance
- [ ] Lozenges still show amber/green states while assigned to work items in current/qa/merge
## Out of Scope
- TBD


@@ -0,0 +1,28 @@
---
name: "Agent pool allows multiple instances of the same agent to run concurrently"
---
# Bug 97: Agent pool allows multiple instances of the same agent to run concurrently
## Description
The agent pool does not enforce single-instance concurrency per agent name. When the pipeline auto-advances (e.g. coder finishes and QA starts automatically), it spawns a new agent instance even if that agent is already running on another story. With only one QA agent defined in `project.toml`, two QA processes were observed running simultaneously on different stories, driving load average past 33 on an M1 Mac. Each QA run spawns `cargo clippy`, `cargo test`, and `vitest`, so two concurrent runs are brutal on resources.
## How to Reproduce
1. Have one QA agent defined in `project.toml`
2. Start a coder on story A — when it completes, the pipeline auto-starts QA on story A
3. While QA is running on story A, start a coder on story B — when it completes, the pipeline auto-starts QA on story B
4. Observe two QA agents running simultaneously despite only one being defined
## Actual Result
Two QA agent processes run concurrently, competing for CPU and causing extreme load.
## Expected Result
The agent pool should enforce single-instance concurrency per agent name. If QA is already running, the next QA job should queue until the current one finishes.
## Acceptance Criteria
- [ ] Agent pool checks if an agent name is already running before spawning a new instance
- [ ] If the agent is busy, the new job is queued and starts when the agent becomes available
- [ ] Pipeline auto-advancement respects the concurrency limit
- [ ] cargo clippy and cargo test pass
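
The expected check-then-queue behavior can be sketched as follows (all names here are hypothetical, not the real pool API; the actual pool would also carry spawn parameters rather than bare story ids):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Single-instance-per-agent-name policy: at most one running instance
/// per agent; extra jobs queue until that agent becomes available.
#[derive(Default)]
struct AgentPool {
    running: HashSet<String>,                  // agent names currently busy
    queued: HashMap<String, VecDeque<String>>, // agent name -> pending story ids
}

impl AgentPool {
    /// Returns true if the job starts now; false if it was queued
    /// because that agent already has a running instance.
    fn request(&mut self, agent: &str, story: &str) -> bool {
        if self.running.insert(agent.to_string()) {
            true // agent was idle: spawn immediately
        } else {
            self.queued
                .entry(agent.to_string())
                .or_default()
                .push_back(story.to_string());
            false
        }
    }

    /// Called when an agent's run finishes; returns the next queued story,
    /// if any. The agent stays marked as running while it drains its queue.
    fn complete(&mut self, agent: &str) -> Option<String> {
        let next = self.queued.get_mut(agent).and_then(|q| q.pop_front());
        if next.is_none() {
            self.running.remove(agent);
        }
        next
    }
}
```

Routing pipeline auto-advancement through the same `request` path as manual starts is what makes AC3 hold: the auto-started QA job for story B would return false here and wait for story A's QA to complete.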