cargo fmt without --all fails with "Failed to find targets" in
workspace repos. This was blocking every story's gates. Also ran
cargo fmt --all to fix all existing formatting issues.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agents can now call get_version to see what server version and commit
they're running against.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests now uses Stdio::inherit so stdout/stderr aren't captured —
tests can only assert on pass/fail and exit code. Tool count bumped
from 59 to 60 for the new get_test_result tool.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove tool_merge_agent_work_returns_started and
tool_get_merge_status_returns_running: these tested the old
non-blocking API but tool_merge_agent_work now blocks in a poll
loop, causing the tests to hang forever.
- Update coder_agents_have_root_cause_guidance: prompt no longer
requires "git bisect" — check for bug workflow guidance instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
spawn() with piped stdout/stderr deadlocks when the test binary
produces more output than the OS pipe buffer (64KB). Switch to
Stdio::inherit so test output flows to server logs and we can
see what's happening.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests now spawns the child and blocks in a 1-second poll loop until
tests complete or the 20-minute timeout fires. Returns the full result
in a single MCP call — agents use 1 turn instead of 50+. Child process
is properly killed on timeout (no zombies).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.
Also fixes agent permission mode: acceptEdits replaces invalid
allowFullAutoEdit that was causing agents to crash-loop on spawn.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bypassPermissions ignored the worktree's .claude/settings.json entirely,
letting agents run any Bash command including cargo test (which they'd
spawn 4+ times concurrently, deadlocking on the build directory lock).
allowFullAutoEdit respects the settings.json allowlist, so agents can
only use the Bash commands we explicitly permit (cargo check, cargo
build, git) and must use MCP tools for everything else (run_tests,
run_lint, run_build).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mergemaster agent was burning all 30 turns polling get_merge_status
every 2 seconds while the merge pipeline takes ~2 minutes. It would
exhaust turns, exit, restart, and repeat — never seeing the result.
merge_agent_work now blocks with a 10-second internal poll loop and
returns the final result directly. The agent calls it once and gets
the answer. No more polling turns wasted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests shared a global CRDT singleton and content store HashMap, causing
flaky failures when parallel tests wrote items that polluted each
other's assertions. 3-5 random test failures per run.
Both CRDT_STATE and CONTENT_STORE now use thread_local! in test mode
so each test thread gets its own isolated instance. Production code
is unchanged — it still uses the global OnceLock singletons.
Also fixed 3 tests (create_story_writes_correct_content,
next_item_number_increments_from_existing_bugs,
next_item_number_scans_archived_too) that relied on leaked state
from other tests — they now write to the content store explicitly.
Result: 1902 passed, 0 failed across 5 consecutive runs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
read_all_items was iterating all CRDT entries including stale duplicates
from earlier stage writes. A story written multiple times (backlog →
current → done) would appear in the output multiple times with different
stages, causing ghost entries in the pipeline status and backlog views.
Now iterates only the index (story_id → visible_index map) which
represents the latest-wins deduplicated view of each story.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Squash merge of story 504: add MCP regression tests for non-string
front_matter values (arrays, bools, integers). The schema change itself
was already on master. Fixed the array assertion to match YAML's
space-after-comma inline sequence format.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The post-520 migration changed validate_story_dirs to read from
pipeline_state::read_all_typed() (the process-global CRDT singleton),
ignoring its root: &Path argument. This broke test isolation — tests
creating a tempdir saw dozens of results from ambient CRDT state,
causing non-deterministic failures that blocked every mergemaster gate.
Remove the CRDT singleton block and rely on the filesystem shadow scan
that already uses the root argument correctly. 1845/1845 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These changes (HashMap<String, String> → HashMap<String, Value> for front matter,
json_value_to_yaml_scalar, and oneOf schema for front_matter) were left uncommitted
on master after a previous merge, blocking the cherry-pick step of story 509's merge.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds the foundational capability to clear a story from the running
server's in-memory CRDT state without restarting the process. This is
story 521, motivated by the 2026-04-09 incident where stories 478 and
503 kept resurrecting from in-memory CRDT after every sqlite delete /
worktree removal / timers.json clear. The only previous remedy was a
full docker restart.
Changes:
- server/src/crdt_state.rs: new `pub fn evict_item(story_id: &str)`.
Looks up the item's CRDT OpId via the visible-index map, calls the
bft-json-crdt list `delete()` primitive to construct a tombstone op,
runs it through the existing `apply_and_persist` machinery (which
signs, applies to the in-memory CRDT, and queues for persistence to
crdt_ops), rebuilds the story_id → visible_index map, and drops the
in-memory CONTENT_STORE entry. The tombstone survives a restart
because it's persisted as a real CRDT op.
- server/src/http/mcp/story_tools.rs: new `tool_purge_story` MCP
handler that takes a story_id and calls evict_item. Deliberately
minimal — does NOT touch agents, worktrees, pipeline_items shadow
table, timers.json, or filesystem shadows. Compose with stop_agent,
remove_worktree, etc. for a full purge. Story 514 (delete_story
full cleanup) is the future "do it all" tool.
- server/src/http/mcp/mod.rs: registers the `purge_story` tool in the
tools list and dispatch table.
Usage:
mcp__huskies__purge_story story_id="<full_story_id>"
Returns a string confirming the eviction. The story will no longer
appear in get_pipeline_status, list_agents, or any other API that
reads from the in-memory CRDT view, and on the next server restart
the persisted tombstone op will keep it from being reconstructed.
This is a prerequisite for story 514 (delete_story full cleanup) and
useful for any "kill it with fire" operator need.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bundles in-progress work from a parallel Claude session toward fixing
bug 501 (rate-limit retry timer doesn't cancel on stop_agent / move_story
/ successful completion). This commit lands the foundation but the MCP
tool wiring is still TODO.
- server/src/chat/timer.rs: defense-in-depth check in tick_once that
skips firing a timer for stories already past 3_qa (3_qa, 4_merge,
5_done, 6_archived). The primary cancellation path will be in the
MCP tools; this guards races where a timer was scheduled before the
story was advanced and the tool didn't get a chance to cancel it.
- server/src/http/context.rs: adds `timer_store: Arc<TimerStore>` field
on AppContext so MCP tools (move_story, stop_agent, ...) can reach
the shared timer store and cancel pending entries when the user
intervenes manually. The test helper is updated to construct one.
- server/src/main.rs: wires up a TimerStore instance in the AppContext
initialiser so the binary actually compiles after the context.rs
field addition. TODO: the matrix bot's spawn_bot still creates its
own TimerStore instance (in chat/transport/matrix/bot/run.rs:220-227)
rather than consuming the shared one — that refactor is the next
step in the bug 501 fix.
What is NOT in this commit and is needed to actually fix bug 501:
- The MCP tool side (move_story, stop_agent, delete_story) does not
yet call timer_store.cancel(story_id) when invoked
- The matrix bot's spawn_bot does not yet consume the shared
timer_store from AppContext — it still creates its own
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Manual squash-merge of feature/story-478_… into master after the in-pipeline
mergemaster runs failed silently. The 478 agent did substantial real work
across multiple respawn cycles before being interrupted; commits on the
feature branch were intact and verified high-quality but never merged via
the normal pipeline path due to compounding bugs:
- The first mergemaster attempt ran ($0.82 in tokens) and exited "Done"
cleanly but didn't push anything to master — likely the worktree was
briefly on master rather than the feature branch when the merge_agent_work
MCP tool ran, so it found nothing to merge.
- Subsequent timer fires defaulted to spawning coders instead of resuming
mergemaster, burning more tokens for no progress.
- Bug 510 (split-brain shadows yanking done stories back to current) and
bug 501 (timers don't cancel on stop/completion) compounded the cost.
What this commit lands:
- server/src/crdt_sync.rs (new, ~518 lines): GET /crdt-sync WebSocket
handler that subscribes to locally-applied SignedOps and streams them as
binary frames. Per-peer bounded queue (256 ops) drops slow peers.
- server/src/crdt_state.rs: new public functions subscribe_ops(),
all_ops_json(), apply_remote_op() backing the sync handler. Adds the
CRDT_OP_TX broadcast channel (capacity 1024).
- server/src/main.rs: wires up the sync subsystem at startup.
- server/src/http/mod.rs: registers the new endpoint.
- server/src/config.rs: adds optional rendezvous field for outbound peers.
- server/src/worktree.rs: minor changes from the original branch.
- server/Cargo.toml: cfg lint suppression for CrdtNode derive.
- crates/bft-json-crdt/src/debug.rs: fix unused-variable warnings.
Resolved a trivial test-mod merge conflict in crdt_state.rs (both 478 and
503 added new tests at the end of the test module — kept both sets).
Note: this is the squash of the original 478 work that the user explicitly
authorized landing. The earlier rogue commit ac9f3ecf — which added a
DIFFERENT, broken implementation of the same feature directly to master
under the user's identity without consent — was reverted earlier in this
session. The forensic tags rogue-commit-2026-04-09-ac9f3ecf and
pre-502-reset-2026-04-09 still exist for incident audit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 490 merge introduced references to a db::crdt module that doesn't
exist yet (it's part of story 491). Commented out with TODO(491)
markers so master compiles. The crdt_state.rs module from 490 is
intact — these are just the call sites that will be wired up when
491 lands.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Marked #[ignore] so cargo test skips it by default. Run manually with
--ignored flag when needed for benchmarking.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CRDT state layer backed by SQLite for pipeline state. Integrates the
BFT JSON CRDT crate with SQLite persistence via sqlx. Ops are persisted
and replayed on startup. Node identity via Ed25519 keypair.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rename all references from storkit to huskies across the codebase:
- .storkit/ directory → .huskies/
- Binary name, Cargo package name, Docker image references
- Server code, frontend code, config files, scripts
- Fix script/test to build frontend before cargo clippy/test
so merge worktrees have frontend/dist available for RustEmbed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>