story-kit: accept 61_spike_filesystem_watcher_architecture

Dave
2026-02-20 19:45:15 +00:00
parent 2f625a60c8
commit 7b506974a5
2 changed files with 0 additions and 0 deletions

@@ -1,85 +0,0 @@
---
name: Filesystem Watcher Architecture
test_plan: pending
---
# Spike 61: Filesystem Watcher Architecture
## Question
Can we replace all the individual mutation handlers (`move_story_to_current`, `move_story_to_archived`, `close_bug_to_archive`, `move_story_to_merge`, etc.) with a single filesystem watcher on `work/` that auto-commits any change and notifies the frontend?
## Motivation
The server is a file mover that commits. Every mutation handler does the same thing: `fs::rename` + `git_stage_and_commit` with a deterministic message. There are hundreds of lines of near-identical code. A watcher would:
- **Replace N mutation functions with 1 watcher** — any file appearing/disappearing in `work/*/` gets auto-committed
- **MCP tools become one-liners** — just `fs::rename(from, to)`, the watcher handles commit + notify
- **IDE drag-and-drop works** — drag a story from `1_upcoming/` to `2_current/` in Zed, watcher commits and frontend updates
- **Manual edits work** — edit a story file, watcher commits it
- **Frontend notifications for free** — watcher broadcasts over WebSocket, no per-handler notification code
## What to Explore
1. **`notify` crate** — does it reliably detect renames/moves across subdirectories on macOS and Linux? What events fire for `fs::rename`?
2. **Debouncing** — git operations touch multiple files. What's the right debounce window? Can we batch changes into a single commit?
3. **Deterministic commit messages** — can the watcher infer intent from the move? e.g., file appears in `2_current/` → "story-kit: start {story_id}", file appears in `5_archived/` → "story-kit: accept {story_id}"
4. **Race conditions** — what if the watcher fires while a git commit is in progress? Need a mutex or queue?
5. **What stays in mutation handlers** — do the MCP tools still need validation (e.g., "can't move to 2_current if not in 1_upcoming")? Or is the filesystem the only source of truth?
6. **Worktree interaction** — worktrees share `.git/`, so commits from the watcher in the main repo don't conflict with worktree work
7. **Startup** — does the watcher detect drift on startup (files moved while server was down)?
## Success Criteria
- [x] Prototype watcher using `notify` crate that detects file changes in `work/`
- [x] Watcher debounces and auto-commits with deterministic messages
- [x] Watcher broadcasts a WebSocket notification after commit
- [x] At least one MCP tool (e.g. `create_story`) simplified to just write the file, letting the watcher commit
- [x] Dragging a file in the IDE triggers commit + frontend update
- [x] Document: what broke, what was hard, what's the migration path
## Out of Scope
- Full migration of all handlers (that's a follow-up story if the spike succeeds)
- Frontend panel implementation (story 55 handles that)
## Findings
### What Was Built
- **`server/src/io/watcher.rs`**: A dedicated OS thread using the `notify` crate (`RecommendedWatcher` = FSEvents on macOS, inotify on Linux). Watches `work/` recursively. Debounces events with a 300 ms window. Batches all changed paths into a single `git add` + `git commit`. Infers intent from the directory name and maps it to a deterministic commit message.
- **WebSocket integration** (`server/src/http/ws.rs`): Each WebSocket client subscribes to a `broadcast::Sender<WatcherEvent>`. When the watcher flushes a batch, it broadcasts a `WatcherEvent`; the WS handler converts it to `WorkItemChanged` and pushes it to the client. Frontend gets live updates with no polling.
- **`create_story` simplified** (`server/src/http/mcp.rs`): The MCP tool now writes the file with `commit = false` and returns without committing. The watcher picks up the new file in `1_upcoming/` within 300 ms and auto-commits `"story-kit: create {story_id}"`.
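
The debounce-and-batch behaviour described for `watcher.rs` can be sketched with the standard library alone. This is a minimal illustration, not the spike's actual code: `next_batch` is a hypothetical helper, a plain `mpsc` channel stands in for the `notify` event stream, and restarting the quiet period on every new event is an assumed policy.

```rust
use std::collections::BTreeSet;
use std::path::PathBuf;
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical helper: blocks until at least one changed path arrives, then
/// keeps collecting until the channel has been quiet for `window`.
/// Returns `None` once every sender is dropped and nothing is pending.
pub fn next_batch(rx: &Receiver<PathBuf>, window: Duration) -> Option<BTreeSet<PathBuf>> {
    // Wait indefinitely for the first event of a batch.
    let first = rx.recv().ok()?;
    let mut batch = BTreeSet::new();
    batch.insert(first);

    // Every further event restarts the quiet-period timer (assumed policy).
    loop {
        match rx.recv_timeout(window) {
            Ok(path) => {
                // Duplicate paths collapse because the batch is a set.
                batch.insert(path);
            }
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    Some(batch)
}
```

In a watcher thread, each returned batch would be staged and committed in one go, with `window` set to `Duration::from_millis(300)` to match the spike.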
### Questions Answered
1. **`notify` crate**: `RecommendedWatcher` fires `EventKind::Create` for the destination path on `fs::rename` across subdirectories on macOS (FSEvents). The source fires `EventKind::Remove`. We only act on `Create` and `Modify`, so a move correctly triggers one commit for the destination.
2. **Debouncing**: 300 ms works well in practice. `fs::rename` is atomic and produces a single actionable event (the destination `Create`). File writes (e.g. editing a story) may fire multiple events; 300 ms collapses them into one commit. Longer windows (500 ms+) could be used if git is slow.
3. **Deterministic commit messages**: Yes, the directory name → action mapping is clean. By stage: `1_upcoming` → `"story-kit: create {item_id}"`, `2_current` → `"story-kit: start {item_id}"`, `3_qa` → `"story-kit: queue {item_id} for QA"`, `4_merge` → `"story-kit: queue {item_id} for merge"`, `5_archived` → `"story-kit: accept {item_id}"`.
4. **Race conditions**: The watcher uses synchronous `git` subprocess calls inside the debounce flush. Since we're on a dedicated thread with no parallelism, there's no concurrent commit risk within the watcher. If a mutation handler commits first, `git commit` exits with "nothing to commit" and the watcher skips gracefully while still broadcasting the event.
5. **What stays in mutation handlers**: Validation (e.g. "must be in 1_upcoming to move to 2_current") stays in MCP tools for now. The migration path is: MCP tools keep their validation and their `fs::rename(from, to)`, but drop the `git_stage_and_commit` call. The watcher handles the commit. This is a clean N→1 reduction of the commit logic.
6. **Worktree interaction**: The watcher runs on the project root, which is separate from worktrees. Worktrees have their own HEAD but share `.git/`. No conflicts: the watcher commits to `master` (in the main worktree), and worktree agents commit to their feature branches independently.
7. **Startup drift**: Not addressed in this spike. The watcher does not scan for uncommitted files on startup. If the server was down and files were moved, those moves would not be retroactively committed. This is a known gap for a follow-up story.
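
The stage-to-message mapping from answer 3 could look roughly like the function below. It is a sketch under assumptions: `commit_message` is a hypothetical name, and the item id is taken to be the file stem of a file sitting directly inside a stage directory.

```rust
use std::path::Path;

/// Hypothetical helper: derives a deterministic commit message from the
/// destination path of a work item. Returns `None` for paths that are not
/// directly inside one of the known stage directories.
pub fn commit_message(path: &Path) -> Option<String> {
    // Assumption: the item id is the file name without its extension.
    let item_id = path.file_stem()?.to_str()?;
    // The stage is the name of the directory that contains the file.
    let stage = path.parent()?.file_name()?.to_str()?;
    let message = match stage {
        "1_upcoming" => format!("story-kit: create {item_id}"),
        "2_current" => format!("story-kit: start {item_id}"),
        "3_qa" => format!("story-kit: queue {item_id} for QA"),
        "4_merge" => format!("story-kit: queue {item_id} for merge"),
        "5_archived" => format!("story-kit: accept {item_id}"),
        _ => return None,
    };
    Some(message)
}
```

Returning `None` for unknown directories lets the caller decide whether to skip the path or fall back to a generic message.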
### What Broke / What Was Hard
- **Nothing broke**: All 240 existing tests pass with the watcher infrastructure in place. The `create_story` simplification required updating one test that previously needed a git repo for its commit (now it just writes a file).
- **Async gap**: The MCP tool returns before the watcher commits. Callers that run `git log` immediately after `create_story` may not see the commit yet. This is acceptable for the UX (the frontend gets a WS notification within 300 ms), but callers that rely on immediate commit visibility need to either wait or continue using `commit = true`.
- **Stage inference is positional**: The watcher infers intent from the *destination* directory. If a file appears in `5_archived/` but was never in `2_current/`, the watcher still emits "accept". Validation must remain in the MCP layer.
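
Because the watcher only sees the destination, the source-stage check has to happen before the rename. A minimal sketch of such a handler follows. `move_item` and its parameters are hypothetical, and the only rule enforced is that the item must currently be in the expected source stage.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical handler: moves `file_name` from `from_stage` to `to_stage`
/// under `work_dir`, refusing if the item is not currently in `from_stage`.
/// It makes no git call; the watcher is expected to commit the move.
pub fn move_item(
    work_dir: &Path,
    file_name: &str,
    from_stage: &str,
    to_stage: &str,
) -> io::Result<PathBuf> {
    let from = work_dir.join(from_stage).join(file_name);
    // Validation stays here: the watcher cannot know where the file came from.
    if !from.is_file() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("{file_name} is not in {from_stage}"),
        ));
    }
    let to_dir = work_dir.join(to_stage);
    fs::create_dir_all(&to_dir)?;
    let to = to_dir.join(file_name);
    fs::rename(&from, &to)?;
    Ok(to)
}
```

A call such as `move_item(work, "61_spike.md", "1_upcoming", "2_current")` succeeds only when the file is actually in `1_upcoming/`.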
### Recommendation
**Proceed with Story: Full Watcher Migration**. The spike validates the core hypothesis. The next story should:
1. Replace `git_stage_and_commit` calls in all mutation MCP tools (`move_story_to_current`, `move_story_to_archived`, `move_story_to_merge`, `close_bug_to_archive`, `accept_story`, `check_criterion`) with plain `fs::rename`/`fs::write` operations.
2. Remove the `commit` parameter from `create_story_file` since the watcher handles all commits in `work/`.
3. Add startup drift detection: on server start, scan `work/` for untracked/modified files and commit them.
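
One possible shape for the startup drift scan in step 3 is to run `git status --porcelain --untracked-files=all` at server start and filter its output for paths under `work/`. The parser below is a sketch, not part of the spike: `drifted_paths` is a hypothetical name, it assumes `work/` sits at the repository root, and it does not handle paths that git quotes because they contain special characters.

```rust
/// Hypothetical helper: extracts paths under `work/` from the output of
/// `git status --porcelain` (v1 format, one entry per line).
pub fn drifted_paths(porcelain: &str) -> Vec<String> {
    porcelain
        .lines()
        .filter_map(|line| {
            // Each line is two status characters, a space, then the path.
            let rest = line.get(3..)?;
            // Renames are reported as "old -> new"; keep the new path.
            let path = rest.rsplit(" -> ").next()?;
            // Ignore changes outside the watched directory.
            path.starts_with("work/").then(|| path.to_string())
        })
        .collect()
}
```

The resulting paths could then be fed through the same batch-and-commit path the watcher uses for live events.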