Agents were spending entire $5 budgets grepping the codebase and reading
git history instead of making fixes, even when the story already specified
exact file paths and function names. Changed the bug workflow from
"investigate root cause first" to "trust the story, act fast": go
directly to the specified location when the story tells you where.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The sync_crdt_stages_from_db migration reads pipeline_items (which has
stale 5_done stages) and overwrites the CRDT back to 5_done for stories
that were already swept to 6_archived. On every restart, done stories
reappear and get re-swept.
The migration served its purpose — CRDT stages are now correct. Remove it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Story 535's triage fix was overwritten by a subsequent merge that
resolved a conflict by taking the old filesystem-based version.
Re-applies the CRDT-based triage that reads from pipeline state
and the content store and works for any pipeline stage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests shared a global CRDT singleton and content store HashMap, causing
flaky failures (3-5 random failures per run) when parallel tests wrote
items that polluted each other's assertions.
Both CRDT_STATE and CONTENT_STORE now use thread_local! in test mode
so each test thread gets its own isolated instance. Production code
is unchanged — it still uses the global OnceLock singletons.
Also fixed 3 tests (create_story_writes_correct_content,
next_item_number_increments_from_existing_bugs,
next_item_number_scans_archived_too) that relied on leaked state
from other tests — they now write to the content store explicitly.
Result: 1902 passed, 0 failed across 5 consecutive runs.
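The pattern can be sketched as follows; the store type and accessor name are illustrative, not the real module layout:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// In test builds, every thread gets its own store; production keeps the
// global OnceLock singleton.
#[cfg(not(test))]
static CONTENT_STORE: OnceLock<Mutex<HashMap<String, String>>> = OnceLock::new();

#[cfg(test)]
thread_local! {
    static CONTENT_STORE: RefCell<HashMap<String, String>> =
        RefCell::new(HashMap::new());
}

// Callers go through one accessor, so production and test code are
// identical at the call site.
#[cfg(not(test))]
pub fn with_store<R>(f: impl FnOnce(&mut HashMap<String, String>) -> R) -> R {
    let store = CONTENT_STORE.get_or_init(|| Mutex::new(HashMap::new()));
    f(&mut store.lock().unwrap())
}

#[cfg(test)]
pub fn with_store<R>(f: impl FnOnce(&mut HashMap<String, String>) -> R) -> R {
    CONTENT_STORE.with(|s| f(&mut s.borrow_mut()))
}
```

Because each test thread owns its own map, a test that forgets to seed the store fails deterministically instead of passing on state leaked by a neighbor.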
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agents were running script/test directly through the PTY, streaming
the full output of npm install, cargo clippy, cargo test, and frontend
builds into session logs. This tripled session log sizes (~200KB to
~600KB per session) and contributed to CLI SIGABRT crashes.
The run_tests MCP tool already runs script/test server-side and returns
a truncated JSON summary. Agents now use it exclusively.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an agent CLI exits without creating a session, we now log:
- Number of prior sessions and total session log bytes
- Child process exit status (exit code or signal)
- Explicit SESSION NONE warning with context
This will help diagnose whether the fatal runtime error
(output.write assertion) correlates with accumulated sessions,
budget exhaustion, or something else.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The rate limit auto-scheduler was creating timers for every hard block,
including short 5-minute throttles. This caused a death loop: agent hits
rate limit, timer set, agent exits, pipeline restarts before timer fires,
new agent dies instantly (Session: None) because API is still throttled.
Short rate limits are handled naturally by the CLI's internal wait. Only
schedule timers for long session-level blocks (>10 min) where the CLI
will exit and needs external restart.
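A minimal sketch of the cutoff, assuming a hypothetical function name and retry-duration plumbing (only the 10-minute threshold comes from this change):

```rust
use std::time::Duration;

const TIMER_THRESHOLD: Duration = Duration::from_secs(10 * 60);

fn should_schedule_restart_timer(retry_after: Duration) -> bool {
    // Short throttles are absorbed by the CLI's own internal wait; only a
    // long session-level block (where the CLI exits) needs an external
    // restart timer. Scheduling timers for short throttles is what caused
    // the restart/throttle race.
    retry_after > TIMER_THRESHOLD
}
```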
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When replaying old CRDT ops that predate new struct fields (e.g.
claimed_by and claim_ts, added by story 479), node_from would call
.unwrap() on None and panic during init. It now defaults missing fields
to an empty CrdtNode::new(), allowing schema evolution without breaking
replay of historical ops.
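Sketch of the unwrap-vs-default change; CrdtNode and the node_from signature are simplified stand-ins for the real types:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, Default, PartialEq)]
struct CrdtNode {
    value: String,
}

impl CrdtNode {
    fn new() -> Self {
        Self::default()
    }
}

// Before: fields.get(key).unwrap() panicked when replaying ops written
// before the field existed. After: a missing field defaults to an empty
// node, so historical ops still replay.
fn node_from(fields: &HashMap<String, CrdtNode>, key: &str) -> CrdtNode {
    fields.get(key).cloned().unwrap_or_else(CrdtNode::new)
}
```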
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
510 stories had stale 1_backlog stages in the CRDT because they were
imported during the filesystem→CRDT migration and then moved forward
via filesystem-only moves that never wrote CRDT ops. This made done
stories appear as ghost entries in the backlog.
On startup, the migration reads the authoritative stage from
pipeline_items and corrects any CRDT entries that disagree.
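The reconciliation pass, modeled here on plain maps; in the real code the left side is the pipeline_items SQLite table and each correction emits a signed CRDT op rather than a map write:

```rust
use std::collections::HashMap;

fn sync_crdt_stages_from_db(
    db_stages: &HashMap<String, String>,       // authoritative stages
    crdt_stages: &mut HashMap<String, String>, // possibly stale CRDT view
) -> usize {
    let mut corrected = 0;
    for (story_id, db_stage) in db_stages {
        if crdt_stages.get(story_id) != Some(db_stage) {
            crdt_stages.insert(story_id.clone(), db_stage.clone());
            corrected += 1;
        }
    }
    corrected
}
```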
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
read_all_items was iterating all CRDT entries including stale duplicates
from earlier stage writes. A story written multiple times (backlog →
current → done) would appear in the output multiple times with different
stages, causing ghost entries in the pipeline status and backlog views.
Now iterates only the index (story_id → visible_index map) which
represents the latest-wins deduplicated view of each story.
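The before/after difference, with the entry list and index modeled as simple stand-ins for the CRDT internals:

```rust
use std::collections::HashMap;

// entries models the append-ordered CRDT list (including stale duplicates);
// index models the story_id -> visible_index map, which already encodes the
// latest-wins view of each story.
fn read_all_items<'a>(
    entries: &'a [(String, String)], // (story_id, stage) in write order
    index: &HashMap<String, usize>,
) -> Vec<(&'a str, &'a str)> {
    // Iterate the index, not the raw entries, so each story appears exactly
    // once, at its latest stage.
    index
        .values()
        .map(|&i| (entries[i].0.as_str(), entries[i].1.as_str()))
        .collect()
}
```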
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Squash merge of story 504: add MCP regression tests for non-string
front_matter values (arrays, bools, integers). The schema change itself
was already on master. Fixed the array assertion to match YAML's
space-after-comma inline sequence format.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The post-520 migration changed validate_story_dirs to read from
pipeline_state::read_all_typed() (the process-global CRDT singleton),
ignoring its root: &Path argument. This broke test isolation — tests
creating a tempdir saw dozens of results from ambient CRDT state,
causing non-deterministic failures that blocked every mergemaster gate.
Remove the CRDT singleton block and rely on the filesystem shadow scan
that already uses the root argument correctly. 1845/1845 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a story is found in the CRDT but not in the expected source stages,
and missing_ok is true, return Ok(None) instead of proceeding with the move.
This prevents promote_ready_backlog_stories from demoting a story that has
already advanced to merge/done via a stale filesystem shadow in 1_backlog.
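A simplified version of the guard; the stage names come from the pipeline, but the function shape is a stand-in for the real move helper:

```rust
fn resolve_source_stage(
    crdt_stage: Option<&str>,
    source_stages: &[&str],
    missing_ok: bool,
) -> Result<Option<String>, String> {
    match crdt_stage {
        // Found where expected: proceed with the move.
        Some(stage) if source_stages.contains(&stage) => {
            Ok(Some(stage.to_string()))
        }
        // Known to the CRDT but already past the source stages: with
        // missing_ok, skip rather than demote via a stale filesystem shadow.
        Some(_) if missing_ok => Ok(None),
        Some(stage) => {
            Err(format!("story not in expected source stages (at {})", stage))
        }
        None if missing_ok => Ok(None),
        None => Err("story not found in CRDT".to_string()),
    }
}
```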
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
These changes (HashMap<String, String> → HashMap<String, Value> for front matter,
json_value_to_yaml_scalar, and oneOf schema for front_matter) were left uncommitted
on master after a previous merge, blocking the cherry-pick step of story 509's merge.
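A sketch of what the scalar conversion looks like, using hand-rolled enums rather than the real serde types (the actual json_value_to_yaml_scalar operates on serde_json::Value):

```rust
enum Json {
    Str(String),
    Bool(bool),
    Int(i64),
    Array(Vec<Json>),
}

fn json_value_to_yaml_scalar(v: &Json) -> String {
    match v {
        Json::Str(s) => s.clone(),
        Json::Bool(b) => b.to_string(),
        Json::Int(i) => i.to_string(),
        // Inline (flow) YAML sequence with a space after each comma,
        // matching the format the front-matter regression tests assert.
        Json::Array(items) => {
            let inner: Vec<String> =
                items.iter().map(json_value_to_yaml_scalar).collect();
            format!("[{}]", inner.join(", "))
        }
    }
}
```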
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds the foundational capability to clear a story from the running
server's in-memory CRDT state without restarting the process. This is
story 521, motivated by the 2026-04-09 incident where stories 478 and
503 kept resurrecting from in-memory CRDT after every sqlite delete /
worktree removal / timers.json clear. The only previous remedy was a
full docker restart.
Changes:
- server/src/crdt_state.rs: new `pub fn evict_item(story_id: &str)`.
Looks up the item's CRDT OpId via the visible-index map, calls the
bft-json-crdt list `delete()` primitive to construct a tombstone op,
runs it through the existing `apply_and_persist` machinery (which
signs, applies to the in-memory CRDT, and queues for persistence to
crdt_ops), rebuilds the story_id → visible_index map, and drops the
in-memory CONTENT_STORE entry. The tombstone survives a restart
because it's persisted as a real CRDT op.
- server/src/http/mcp/story_tools.rs: new `tool_purge_story` MCP
handler that takes a story_id and calls evict_item. Deliberately
minimal — does NOT touch agents, worktrees, pipeline_items shadow
table, timers.json, or filesystem shadows. Compose with stop_agent,
remove_worktree, etc. for a full purge. Story 514 (delete_story
full cleanup) is the future "do it all" tool.
- server/src/http/mcp/mod.rs: registers the `purge_story` tool in the
tools list and dispatch table.
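The eviction steps above can be sketched as follows. This models only the ordering of the steps: ops, index, and content are simplified stand-ins, not the bft-json-crdt API, and nothing here signs or persists:

```rust
use std::collections::HashMap;

struct CrdtState {
    ops: Vec<(String, bool)>,         // (story_id, is_tombstone) in op order
    index: HashMap<String, usize>,    // story_id -> visible index
    content: HashMap<String, String>, // in-memory content store
}

impl CrdtState {
    fn evict_item(&mut self, story_id: &str) -> Result<(), String> {
        // 1. Find the item via the visible-index map.
        if !self.index.contains_key(story_id) {
            return Err(format!("{} not visible", story_id));
        }
        // 2. Append a tombstone op (the real code signs it and queues it
        //    for persistence, which is why the eviction survives restart).
        self.ops.push((story_id.to_string(), true));
        // 3. Rebuild story_id -> visible_index; tombstones remove entries.
        let mut index = HashMap::new();
        for (i, (id, is_tombstone)) in self.ops.iter().enumerate() {
            if *is_tombstone {
                index.remove(id);
            } else {
                index.insert(id.clone(), i);
            }
        }
        self.index = index;
        // 4. Drop the in-memory content entry.
        self.content.remove(story_id);
        Ok(())
    }
}
```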
Usage:
mcp__huskies__purge_story story_id="<full_story_id>"
Returns a string confirming the eviction. The story will no longer
appear in get_pipeline_status, list_agents, or any other API that
reads from the in-memory CRDT view, and on the next server restart
the persisted tombstone op will keep it from being reconstructed.
This is a prerequisite for story 514 (delete_story full cleanup) and
useful for any "kill it with fire" operator need.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>