run_tests now spawns the child and blocks in a 1-second poll loop until
tests complete or the 20-minute timeout fires. Returns the full result
in a single MCP call — agents use 1 turn instead of 50+. Child process
is properly killed on timeout (no zombies).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.
Also fixes agent permission mode: acceptEdits replaces invalid
allowFullAutoEdit that was causing agents to crash-loop on spawn.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Writes HEAD short hash to .huskies/build_hash after successful cargo
build. Logs it on startup as [startup] Running build: <hash>. No more
guessing whether the rebuild actually deployed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
bypassPermissions ignored the worktree's .claude/settings.json entirely,
letting agents run any Bash command including cargo test (which they'd
spawn 4+ times concurrently, deadlocking on the build directory lock).
allowFullAutoEdit respects the settings.json allowlist, so agents can
only use the Bash commands we explicitly permit (cargo check, cargo
build, git) and must use MCP tools for everything else (run_tests,
run_lint, run_build).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agents were running cargo test directly via Bash instead of using the
run_tests MCP tool, causing 4 concurrent cargo builds that deadlocked
on the build directory lock. Removed cargo test, cargo clippy, cargo
nextest, script/test, npm test, and pnpm test from the allowed Bash
commands. Agents must use the run_tests MCP tool which returns truncated
output and prevents concurrent builds.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The mergemaster agent was burning all 30 turns polling get_merge_status
every 2 seconds while the merge pipeline takes ~2 minutes. It would
exhaust turns, exit, restart, and repeat — never seeing the result.
merge_agent_work now blocks with a 10-second internal poll loop and
returns the final result directly. The agent calls it once and gets
the answer. No more polling turns wasted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The sync_crdt_stages_from_db migration reads pipeline_items (which has
stale 5_done stages) and overwrites the CRDT back to 5_done for stories
that were already swept to 6_archived. On every restart, done stories
reappear and get re-swept.
The migration served its purpose — CRDT stages are now correct. Remove it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Story 535's triage fix was overwritten by a subsequent merge that
resolved a conflict by taking the old filesystem-based version.
Re-applies the CRDT-based triage that reads from pipeline state
and content store, works for any pipeline stage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests shared a global CRDT singleton and content store HashMap, causing
flaky failures when parallel tests wrote items that polluted each
other's assertions. 3-5 random test failures per run.
Both CRDT_STATE and CONTENT_STORE now use thread_local! in test mode
so each test thread gets its own isolated instance. Production code
is unchanged — it still uses the global OnceLock singletons.
Also fixed 3 tests (create_story_writes_correct_content,
next_item_number_increments_from_existing_bugs,
next_item_number_scans_archived_too) that relied on leaked state
from other tests — they now write to the content store explicitly.
Result: 1902 passed, 0 failed across 5 consecutive runs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an agent CLI exits without creating a session, we now log:
- Number of prior sessions and total session log bytes
- Child process exit status (exit code or signal)
- Explicit SESSION NONE warning with context
This will help diagnose whether the fatal runtime error
(output.write assertion) correlates with accumulated sessions,
budget exhaustion, or something else.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The rate limit auto-scheduler was creating timers for every hard block,
including short 5-minute throttles. This caused a death loop: agent hits
rate limit, timer set, agent exits, pipeline restarts before timer fires,
new agent dies instantly (Session: None) because API is still throttled.
Short rate limits are handled naturally by the CLI's internal wait. Only
schedule timers for long session-level blocks (>10 min) where the CLI
will exit and needs external restart.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
510 stories had stale 1_backlog stages in the CRDT because they were
imported during the filesystem→CRDT migration and then moved forward
via filesystem-only moves that never wrote CRDT ops. This made done
stories appear as ghost entries in the backlog.
On startup, reads the authoritative stage from pipeline_items and
corrects any CRDT entries that disagree.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
read_all_items was iterating all CRDT entries including stale duplicates
from earlier stage writes. A story written multiple times (backlog →
current → done) would appear in the output multiple times with different
stages, causing ghost entries in the pipeline status and backlog views.
Now iterates only the index (story_id → visible_index map) which
represents the latest-wins deduplicated view of each story.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Squash merge of story 504: add MCP regression tests for non-string
front_matter values (arrays, bools, integers). The schema change itself
was already on master. Fixed the array assertion to match YAML's
space-after-comma inline sequence format.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The post-520 migration changed validate_story_dirs to read from
pipeline_state::read_all_typed() (the process-global CRDT singleton),
ignoring its root: &Path argument. This broke test isolation — tests
creating a tempdir saw dozens of results from ambient CRDT state,
causing non-deterministic failures that blocked every mergemaster gate.
Remove the CRDT singleton block and rely on the filesystem shadow scan
that already uses the root argument correctly. 1845/1845 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a story is found in the CRDT but not in the expected source stages,
and missing_ok is true, return Ok(None) instead of proceeding with the move.
This prevents promote_ready_backlog_stories from demoting a story that has
already advanced to merge/done via a stale filesystem shadow in 1_backlog.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>