feat(521): evict_item primitive + purge_story MCP tool

Adds the foundational capability to clear a story from the running
server's in-memory CRDT state without restarting the process. This is
story 521, motivated by the 2026-04-09 incident where stories 478 and
503 kept resurrecting from in-memory CRDT after every sqlite delete /
worktree removal / timers.json clear. Previously the only remedy was a
full Docker restart.

Changes:

  - server/src/crdt_state.rs: new `pub fn evict_item(story_id: &str)`.
    Looks up the item's CRDT OpId via the visible-index map, calls the
    bft-json-crdt list `delete()` primitive to construct a tombstone op,
    runs it through the existing `apply_and_persist` machinery (which
    signs, applies to the in-memory CRDT, and queues for persistence to
    crdt_ops), rebuilds the story_id → visible_index map, and drops the
    in-memory CONTENT_STORE entry. The tombstone survives a restart
    because it's persisted as a real CRDT op.

  - server/src/http/mcp/story_tools.rs: new `tool_purge_story` MCP
    handler that takes a story_id and calls evict_item. Deliberately
    minimal — does NOT touch agents, worktrees, pipeline_items shadow
    table, timers.json, or filesystem shadows. Compose with stop_agent,
    remove_worktree, etc. for a full purge. Story 514 (delete_story
    full cleanup) is the future "do it all" tool.

  - server/src/http/mcp/mod.rs: registers the `purge_story` tool in the
    tools list and dispatch table.
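
As an illustration of the evict_item flow described above, here is a
minimal, self-contained sketch. It is a toy model, not the bft-json-crdt
API: the `State`, `Item`, `insert`, and `evict` names, the integer op ids,
the string op log, and the story ids are all hypothetical stand-ins, and
signing is omitted entirely.

```rust
use std::collections::HashMap;

/// Toy stand-in for a CRDT list entry (hypothetical, not the real type).
#[derive(Debug)]
struct Item {
    op_id: u64,       // stable identity assigned at insert time
    story_id: String,
    is_deleted: bool, // tombstone flag: the entry itself is never removed
}

#[derive(Debug, Default)]
struct State {
    items: Vec<Item>,
    index: HashMap<String, usize>,    // story_id -> visible index
    content: HashMap<String, String>, // stand-in for CONTENT_STORE
    op_log: Vec<String>,              // stand-in for the crdt_ops table
}

impl State {
    fn insert(&mut self, story_id: &str, body: &str) {
        let op_id = self.items.len() as u64 + 1;
        self.items.push(Item { op_id, story_id: story_id.to_string(), is_deleted: false });
        self.content.insert(story_id.to_string(), body.to_string());
        self.op_log.push(format!("insert {op_id} {story_id}"));
        self.rebuild_index();
    }

    /// Visible indices count only non-deleted items, so they shift
    /// whenever an earlier item is tombstoned.
    fn rebuild_index(&mut self) {
        self.index = self
            .items
            .iter()
            .filter(|i| !i.is_deleted)
            .enumerate()
            .map(|(visible, i)| (i.story_id.clone(), visible))
            .collect();
    }

    fn evict(&mut self, story_id: &str) -> Result<(), String> {
        // 1. Look up the visible index, then resolve it to a stable op id.
        let visible = *self
            .index
            .get(story_id)
            .ok_or_else(|| format!("Story '{story_id}' not found"))?;
        let op_id = self
            .items
            .iter()
            .filter(|i| !i.is_deleted)
            .nth(visible)
            .map(|i| i.op_id)
            .ok_or_else(|| format!("Index {visible} did not resolve to an op id"))?;
        // 2. Apply the tombstone in memory and record it in the log.
        if let Some(item) = self.items.iter_mut().find(|i| i.op_id == op_id) {
            item.is_deleted = true;
        }
        self.op_log.push(format!("delete {op_id}"));
        // 3. Rebuild the index and drop the cached body.
        self.rebuild_index();
        self.content.remove(story_id);
        Ok(())
    }
}

/// Insert three hypothetical stories, then evict the first one.
fn demo() -> State {
    let mut state = State::default();
    state.insert("1_story_alpha", "body alpha");
    state.insert("2_story_beta", "body beta");
    state.insert("3_story_gamma", "body gamma");
    state.evict("1_story_alpha").expect("story exists");
    state
}

fn main() {
    let state = demo();
    // The tombstoned item is still stored, but no longer visible.
    println!("stored items: {}", state.items.len()); // prints "stored items: 3"
    println!("{:?}", state.index.get("2_story_beta")); // prints "Some(0)"
}
```

In this sketch the evicted entry stays in `items` with its flag set, which
is why the index has to be rebuilt rather than patched: every later story
moves up by one visible position.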

Usage:

    mcp__huskies__purge_story story_id="<full_story_id>"

Returns a string confirming the eviction. The story will no longer
appear in get_pipeline_status, list_agents, or any other API that
reads from the in-memory CRDT view, and on the next server restart
the persisted tombstone op will keep it from being reconstructed.
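
To make the restart behaviour concrete, the sketch below replays a
persisted op log the way a restart might, assuming the in-memory state is
rebuilt by applying logged ops in order. The `Op` enum, the `replay`
function, and the story ids are hypothetical; the real crdt_ops rows hold
signed CRDT ops and are not shown here.

```rust
/// Hypothetical persisted op; a simplified stand-in for a crdt_ops row.
#[derive(Clone, Debug, PartialEq)]
enum Op {
    Insert { op_id: u64, story_id: String },
    Delete { target: u64 },
}

/// Rebuild the list of visible story ids by replaying the log in order.
fn replay(log: &[Op]) -> Vec<String> {
    // Each entry is (op_id, story_id, is_deleted).
    let mut items: Vec<(u64, String, bool)> = Vec::new();
    for op in log {
        match op {
            Op::Insert { op_id, story_id } => {
                items.push((*op_id, story_id.clone(), false));
            }
            Op::Delete { target } => {
                // A tombstone flags the item; it does not remove the entry.
                for item in items.iter_mut().filter(|i| i.0 == *target) {
                    item.2 = true;
                }
            }
        }
    }
    items.into_iter().filter(|i| !i.2).map(|i| i.1).collect()
}

fn main() {
    let mut log = vec![
        Op::Insert { op_id: 1, story_id: "1_story_alpha".to_string() },
        Op::Insert { op_id: 2, story_id: "2_story_beta".to_string() },
    ];
    // Without a tombstone in the log, a replay brings both stories back.
    println!("{:?}", replay(&log)); // prints ["1_story_alpha", "2_story_beta"]

    // With the delete op persisted, the story stays gone after replay.
    log.push(Op::Delete { target: 1 });
    println!("{:?}", replay(&log)); // prints ["2_story_beta"]
}
```

Under that assumption, the delete op has to live in the same log as the
insert for the removal to survive a replay.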

This is a prerequisite for story 514 (delete_story full cleanup) and
useful for any "kill it with fire" operator need.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Timmy
2026-04-09 21:29:09 +01:00
parent 13635b01bc
commit 1d9287389a
3 changed files with 110 additions and 0 deletions
server/src/crdt_state.rs: +63
@@ -500,6 +500,69 @@ pub fn read_item(story_id: &str) -> Option<PipelineItemView> {
extract_item_view(&state.crdt.doc.items[idx])
}
/// Mark a story as deleted in the in-memory CRDT and persist a tombstone op.
///
/// This is the eviction primitive for story 521 — it lets external callers
/// (e.g. the `purge_story` MCP tool, or operator scripts during incident
/// response) clear an item from the running server's in-memory state
/// without needing a full process restart.
///
/// Specifically:
/// 1. Looks up the item's CRDT `OpId` via the visible-index map.
/// 2. Constructs a delete op via the bft-json-crdt list `delete()` primitive.
/// 3. Signs it with the local node's keypair and applies it to the in-memory
/// CRDT (marking the item `is_deleted = true` so subsequent
/// `read_all_items` / `read_item` calls skip it).
/// 4. Persists the signed delete op to `crdt_ops` via the existing
/// `apply_and_persist` channel — so the eviction survives a restart.
/// 5. Rebuilds the `story_id → visible_index` map (visible indices shift
/// when an item is marked deleted).
/// 6. Drops the in-memory content-store entry for the story so the cached
/// body doesn't outlive the CRDT entry.
///
/// Returns `Ok(())` if the item was found and a tombstone op was queued,
/// or an `Err` if the CRDT layer isn't initialised or the story_id is
/// unknown to the in-memory state.
pub fn evict_item(story_id: &str) -> Result<(), String> {
    let state_mutex = CRDT_STATE
        .get()
        .ok_or_else(|| "CRDT layer not initialised".to_string())?;
    let mut state = state_mutex
        .lock()
        .map_err(|e| format!("CRDT lock poisoned: {e}"))?;
    let idx = state
        .index
        .get(story_id)
        .copied()
        .ok_or_else(|| format!("Story '{story_id}' not found in in-memory CRDT"))?;

    // Resolve the item's OpId before the closure (the closure will mutably
    // borrow `state`, so we can't access `state.crdt.doc.items` from inside).
    let item_id = state
        .crdt
        .doc
        .items
        .id_at(idx)
        .ok_or_else(|| format!("Item index {idx} for '{story_id}' did not resolve to an OpId"))?;

    // Write the delete op via the existing apply_and_persist machinery.
    // This signs the op, applies it to the in-memory CRDT (marking the item
    // is_deleted), and sends it to the persistence task.
    apply_and_persist(&mut state, |s| s.crdt.doc.items.delete(item_id));

    // Rebuild the story_id → visible_index map; the deleted item is no
    // longer counted by the iter that rebuild_index uses.
    state.index = rebuild_index(&state.crdt);

    // Drop the content-store entry so the cached body doesn't outlive the
    // CRDT entry. (Bug 521 follow-up: when CONTENT_STORE becomes a true
    // lazy cache, this explicit eviction can go away.)
    crate::db::delete_content(story_id);

    Ok(())
}
/// Extract a `PipelineItemView` from a `PipelineItemCrdt`.
fn extract_item_view(item: &PipelineItemCrdt) -> Option<PipelineItemView> {
let story_id = match item.story_id.view() {
server/src/http/mcp/mod.rs: +16
@@ -993,6 +993,20 @@ fn handle_tools_list(id: Option<Value>) -> JsonRpcResponse {
"required": ["story_id"]
}
},
{
"name": "purge_story",
"description": "Write a CRDT tombstone op for a story (story 521). Marks the in-memory CRDT item as deleted, persists the tombstone to crdt_ops so it survives restart, and drops the in-memory content store entry. Does NOT touch running agents, worktrees, the pipeline_items shadow table, timers.json, or filesystem shadows — compose with stop_agent / remove_worktree / etc. for a full cleanup. Use this when a story has gone zombie in the running server's in-memory state and direct sqlite deletes alone are not enough to clear it.",
"inputSchema": {
"type": "object",
"properties": {
"story_id": {
"type": "string",
"description": "Work item identifier (filename stem, e.g. '28_story_my_feature')"
}
},
"required": ["story_id"]
}
},
{
"name": "move_story",
"description": "Move a work item (story, bug, spike, or refactor) to an arbitrary pipeline stage. Prefer dedicated tools when available: use accept_story to mark items done, move_story_to_merge to queue for merging, or request_qa to trigger QA review. Use move_story only for arbitrary moves that lack a dedicated tool — for example, moving a story back to backlog or recovering a ghost story by moving it back to current.",
@@ -1307,6 +1321,8 @@ async fn handle_tools_call(
"get_token_usage" => diagnostics::tool_get_token_usage(&args, ctx),
// Delete story
"delete_story" => story_tools::tool_delete_story(&args, ctx).await,
// Purge story (CRDT tombstone — story 521)
"purge_story" => story_tools::tool_purge_story(&args, ctx),
// Arbitrary pipeline movement
"move_story" => diagnostics::tool_move_story(&args, ctx),
// Unblock story
server/src/http/mcp/story_tools.rs: +31
@@ -67,6 +67,37 @@ pub(super) fn tool_create_story(args: &Value, ctx: &AppContext) -> Result<String
Ok(format!("Created story: {story_id}"))
}
/// Purge a story from the in-memory CRDT by writing a tombstone op (story 521).
///
/// This is the MCP entry point to the eviction primitive, aimed at the
/// four-state-machine drift problem we hit on 2026-04-09: when a story gets
/// stuck in the running server's in-memory CRDT and can't be cleared by
/// sqlite deletes alone (because the in-memory state outlives any
/// pipeline_items / crdt_ops manipulation), this tool writes a proper CRDT
/// delete op via `crdt_state::evict_item`.
///
/// The eviction:
/// - Marks the in-memory CRDT item as `is_deleted = true` immediately
///   (so subsequent `read_all_items` / `read_item` calls skip it)
/// - Persists the tombstone op to `crdt_ops` so it survives a server restart
/// - Drops the in-memory `CONTENT_STORE` entry for the story
///
/// This tool does NOT touch: running agents, worktrees, the `pipeline_items`
/// shadow table, `timers.json`, or filesystem shadows. Compose with
/// `stop_agent`, `remove_worktree`, etc. as needed for a full purge — or
/// see story 514 (delete_story full cleanup) for a future "do it all" tool.
pub(super) fn tool_purge_story(args: &Value, _ctx: &AppContext) -> Result<String, String> {
    let story_id = args
        .get("story_id")
        .and_then(|v| v.as_str())
        .ok_or("Missing required argument: story_id")?;

    crate::crdt_state::evict_item(story_id)?;

    Ok(format!(
        "Evicted '{story_id}' from in-memory CRDT (tombstone op persisted to crdt_ops; CONTENT_STORE entry dropped)."
    ))
}
pub(super) fn tool_validate_stories(ctx: &AppContext) -> Result<String, String> {
let root = ctx.state.get_project_root()?;
let results = validate_story_dirs(&root)?;