The permission lockdown restricted run_command/run_tests to
.huskies/worktrees/ only. The mergemaster could diagnose merge
conflict compile errors but couldn't edit files in .huskies/merge_workspace/
to fix them. Add merge_workspace as an allowed path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The auto-resolver kept both sides of the conflict: the feature
branch's _project_root signature and master's filesystem code, which
still references project_root. The result was a compile error. Remove
the filesystem fallback on master so there's no conflict. CRDT is the
only source of truth.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stdio::inherit sent test output to server stdout, making it invisible
to agents calling run_tests via MCP. Switch back to Stdio::piped with
background drain threads (same pattern as gates.rs) to capture output
without the pipe deadlock that caused the original switch to inherit.
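A minimal sketch of the piped-plus-drain-threads pattern, for reference. The helper name run_captured is hypothetical and this is not the code in gates.rs; it only illustrates why reading both pipes on background threads avoids the full-buffer deadlock.

```rust
use std::io::Read;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;

/// Run a command with piped stdout/stderr, draining both pipes on
/// background threads so the child never blocks on a full pipe buffer.
/// (Hypothetical helper, not the actual run_tests implementation.)
fn run_captured(mut cmd: Command) -> std::io::Result<(ExitStatus, String, String)> {
    let mut child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()?;

    let mut stdout = child.stdout.take().expect("stdout was piped");
    let mut stderr = child.stderr.take().expect("stderr was piped");

    // Each thread reads its pipe to EOF, independently of the other.
    let out_thread = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = stdout.read_to_end(&mut buf);
        buf
    });
    let err_thread = thread::spawn(move || {
        let mut buf = Vec::new();
        let _ = stderr.read_to_end(&mut buf);
        buf
    });

    // Safe to wait now: the pipes are being emptied concurrently.
    let status = child.wait()?;
    let out = out_thread.join().unwrap_or_default();
    let err = err_thread.join().unwrap_or_default();

    Ok((
        status,
        String::from_utf8_lossy(&out).into_owned(),
        String::from_utf8_lossy(&err).into_owned(),
    ))
}

fn main() -> std::io::Result<()> {
    // Emit well over 64KB on stdout plus a line on stderr.
    let mut cmd = Command::new("sh");
    cmd.args(["-c", "head -c 200000 /dev/zero | tr '\\0' 'x'; echo err >&2"]);
    let (status, out, err) = run_captured(cmd)?;
    println!("success={} stdout_bytes={} stderr={}", status.success(), out.len(), err.trim());
    Ok(())
}
```

Calling wait() before reading, with no drain threads, is the shape that can hang once the child writes more than the pipe buffer holds.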
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cargo fmt without --all fails with "Failed to find targets" in
workspace repos. This was blocking every story's gates. Also ran
cargo fmt --all to fix all existing formatting issues.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agents can now call get_version to see what server version and commit
they're running against.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Startup now logs "huskies v0.10.0 (build abc1234)" so we can verify
both the version and the commit that's running. build_hash is a
runtime artifact, not tracked in git.
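Roughly how the startup line could be assembled, as a sketch only. The function names and the "unknown" fallback for a missing build_hash file are assumptions, not the actual implementation; the version is hard-coded so the snippet runs standalone.

```rust
use std::fs;
use std::path::Path;

/// Read the build hash file, falling back to "unknown" when the file is
/// missing or empty. (The fallback value is an assumption.)
fn read_build_hash(path: &Path) -> String {
    fs::read_to_string(path)
        .ok()
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .unwrap_or_else(|| "unknown".to_string())
}

/// Format the startup line from a version and a build hash.
fn startup_banner(version: &str, build_hash: &str) -> String {
    format!("huskies v{version} (build {build_hash})")
}

fn main() {
    let hash = read_build_hash(Path::new(".huskies/build_hash"));
    // In a Cargo build the version would more likely come from
    // env!("CARGO_PKG_VERSION"); hard-coded here for the sketch.
    println!("{}", startup_banner("0.10.0", &hash));
}
```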
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
These read-only tools were missing from the locked-down settings,
causing permission prompts to flood Matrix chat for every agent
file read.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use fsync in coverage gate tests to ensure the kernel releases the
write handle before executing the script. Prevents flaky ETXTBSY
errors on fast test runs.
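A sketch of the write, sync, close, then execute ordering, assuming a Unix target. The helper name write_script and the temp file name are hypothetical; the real tests may be structured differently.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use std::process::Command;

/// Write an executable script, fsync it, and close the write handle
/// before returning, so nothing is still open for writing at exec time.
/// (Hypothetical helper for illustration.)
fn write_script(path: &Path, body: &str) -> std::io::Result<()> {
    {
        let mut file = File::create(path)?;
        file.write_all(body.as_bytes())?;
        file.sync_all()?; // fsync: flush data and metadata to disk
    } // `file` is dropped here, closing the write handle
    fs::set_permissions(path, fs::Permissions::from_mode(0o755))?;
    Ok(())
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir()
        .join(format!("coverage_gate_sketch_{}.sh", std::process::id()));
    write_script(&path, "#!/bin/sh\necho ok\n")?;

    // Execute only after the handle has been synced and closed.
    let output = Command::new(&path).output()?;
    print!("{}", String::from_utf8_lossy(&output.stdout));

    fs::remove_file(&path)?;
    Ok(())
}
```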
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests now uses Stdio::inherit so stdout/stderr aren't captured —
tests can only assert on pass/fail and exit code. Tool count bumped
from 59 to 60 for the new get_test_result tool.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove tool_merge_agent_work_returns_started and
  tool_get_merge_status_returns_running: these tested the old
  non-blocking API but tool_merge_agent_work now blocks in a poll
  loop, causing the tests to hang forever.
- Update coder_agents_have_root_cause_guidance: prompt no longer
  requires "git bisect"; check for bug workflow guidance instead.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
spawn() with piped stdout/stderr deadlocks when the test binary
produces more output than the OS pipe buffer (64KB). Switch to
Stdio::inherit so test output flows to server logs and we can
see what's happening.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The completion handler now pgrep+kills any cargo processes targeting
the worktree's Cargo.toml before running gates. This prevents the
run_tests MCP child from holding the build lock and blocking gates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests now spawns the child and blocks in a 1-second poll loop until
tests complete or the 20-minute timeout fires. Returns the full result
in a single MCP call — agents use 1 turn instead of 50+. Child process
is properly killed on timeout (no zombies).
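The poll loop can be sketched as follows. The helper name wait_with_timeout is hypothetical, and the demo in main uses short durations rather than the 1-second poll and 20-minute timeout described above.

```rust
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Poll `child` until it exits or `timeout` elapses. On timeout the child
/// is killed and reaped, and Ok(None) is returned.
/// (Hypothetical helper, not the actual run_tests code.)
fn wait_with_timeout(
    child: &mut Child,
    poll: Duration,
    timeout: Duration,
) -> std::io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait returns immediately: Some(status) once the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Ignore a kill error in case the child exited in the meantime.
            let _ = child.kill();
            // Reap the process so it does not linger as a zombie.
            child.wait()?;
            return Ok(None);
        }
        thread::sleep(poll);
    }
}

fn main() -> std::io::Result<()> {
    let mut child = Command::new("sleep").arg("5").spawn()?;
    let result = wait_with_timeout(
        &mut child,
        Duration::from_millis(50),
        Duration::from_millis(200),
    )?;
    match result {
        Some(status) => println!("finished: {status}"),
        None => println!("timed out, child killed"),
    }
    Ok(())
}
```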
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.
Also fixes agent permission mode: acceptEdits replaces invalid
allowFullAutoEdit that was causing agents to crash-loop on spawn.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Writes HEAD short hash to .huskies/build_hash after successful cargo
build. Logs it on startup as [startup] Running build: <hash>. No more
guessing whether the rebuild actually deployed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bypassPermissions ignored the worktree's .claude/settings.json entirely,
letting agents run any Bash command including cargo test (which they'd
spawn 4+ times concurrently, deadlocking on the build directory lock).
allowFullAutoEdit respects the settings.json allowlist, so agents can
only use the Bash commands we explicitly permit (cargo check, cargo
build, git) and must use MCP tools for everything else (run_tests,
run_lint, run_build).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agents were running cargo test directly via Bash instead of using the
run_tests MCP tool, causing 4 concurrent cargo builds that deadlocked
on the build directory lock. Removed cargo test, cargo clippy, cargo
nextest, script/test, npm test, and pnpm test from the allowed Bash
commands. Agents must use the run_tests MCP tool which returns truncated
output and prevents concurrent builds.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mergemaster agent was burning all 30 turns polling get_merge_status
every 2 seconds while the merge pipeline takes ~2 minutes. It would
exhaust turns, exit, restart, and repeat — never seeing the result.
merge_agent_work now blocks with a 10-second internal poll loop and
returns the final result directly. The agent calls it once and gets
the answer. No more polling turns wasted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The sync_crdt_stages_from_db migration reads pipeline_items (which has
stale 5_done stages) and overwrites the CRDT back to 5_done for stories
that were already swept to 6_archived. On every restart, done stories
reappear and get re-swept.
The migration served its purpose — CRDT stages are now correct. Remove it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Story 535's triage fix was overwritten by a subsequent merge that
resolved a conflict by taking the old filesystem-based version.
Re-applies the CRDT-based triage, which reads from pipeline state
and the content store and works for any pipeline stage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests shared a global CRDT singleton and content store HashMap, causing
flaky failures when parallel tests wrote items that polluted each
other's assertions. This produced 3-5 random test failures per run.
Both CRDT_STATE and CONTENT_STORE now use thread_local! in test mode
so each test thread gets its own isolated instance. Production code
is unchanged — it still uses the global OnceLock singletons.
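A standalone sketch of the per-thread isolation that thread_local! provides. The store shape (HashMap<String, String>) and the helper names are assumptions, and the test-mode switch between thread_local! and the OnceLock global is left out so the snippet runs on its own.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::thread;

thread_local! {
    // Every thread that touches CONTENT_STORE gets its own empty map.
    // (Key and value types are assumed for illustration.)
    static CONTENT_STORE: RefCell<HashMap<String, String>> =
        RefCell::new(HashMap::new());
}

/// Insert an item into the calling thread's store.
fn write_content(id: &str, body: &str) {
    CONTENT_STORE.with(|store| {
        store.borrow_mut().insert(id.to_string(), body.to_string());
    });
}

/// Look up an item in the calling thread's store.
fn read_content(id: &str) -> Option<String> {
    CONTENT_STORE.with(|store| store.borrow().get(id).cloned())
}

/// Number of items visible to the calling thread.
fn content_count() -> usize {
    CONTENT_STORE.with(|store| store.borrow().len())
}

fn main() {
    // Each spawned thread writes one item and sees only that item.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            thread::spawn(move || {
                write_content(&format!("story_{i}"), "body");
                content_count()
            })
        })
        .collect();

    for handle in handles {
        println!("items visible to thread: {}", handle.join().unwrap());
    }
}
```

Because the default test harness runs each test on its own thread, writes made by one test are not visible to another under this scheme.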
Also fixed 3 tests (create_story_writes_correct_content,
next_item_number_increments_from_existing_bugs,
next_item_number_scans_archived_too) that relied on leaked state
from other tests — they now write to the content store explicitly.
Result: 1902 passed, 0 failed across 5 consecutive runs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>