Commit Graph

17 Commits

Author SHA1 Message Date
dave 54d9737428 huskies: merge 1060 2026-05-14 19:31:04 +00:00
dave 977b954e98 huskies: merge 1051 2026-05-14 18:04:30 +00:00
Timmy 5e5c5a0e08 revert: remove temporary merge-reap diagnostic logging
Reverts the diagnostic introduced in 91b4e4ff. Will re-add when we
actively debug the disappearance bug again.
2026-05-14 10:57:37 +01:00
Timmy 91b4e4ff7c diag: log merge-reap values to debug disappearance bug
Temporary diagnostic added to reap_stale_merge_jobs to surface the t,
current_boot, and decoded values being compared on every reap pass.
Will revert once the disappearance bug is understood.
2026-05-14 10:42:16 +01:00
Timmy 8b2ba1c810 fix: post-squash compile errors reclassify as semantic merge conflicts
When deterministic-merge produces a clean git squash but the post-squash
compile fails (typical when master gained a Stage payload field after the
feature branch forked — e.g. story 1018 hit `error[E0063]: missing field
plan` after 1010's PlanState landed), the failure is morally a merge
conflict that git's diff3 missed: the conflicting literal lives in a
different file from the type definition that changed on master. Routing
it as GatesFailed left mergemaster idle and the story stuck.

Changes:
- gates.rs GateFailureKind::classify: detect rustc compile errors
  (`error[E\d+]`) as Build instead of falling through to Test. Clippy
  errors (`error[clippy::...]`) still classify as Lint.
- agents/merge/mod.rs: new MergeResult::to_merge_failure_kind() method.
  GateFailure with failure_kind=Build maps to ConflictDetected (so the
  existing 998 subscriber auto-spawns mergemaster). Other gate failures
  stay GatesFailed.
- agents/pool/pipeline/merge/runner.rs: replace the inline match with a
  call to the new method.

Tests: 6 new unit tests covering the classifier branch and every
to_merge_failure_kind arm. All 2932 tests pass.
2026-05-14 10:18:33 +01:00
dave ebf58ef224 huskies: merge 1008 2026-05-14 08:46:16 +00:00
Timmy 2758f744f2 fix: reap_stale_merge_jobs re-dispatches instead of just deleting
A mid-merge server restart used to silently kill the merge: the
in-flight tokio task died with the process, reap_stale_merge_jobs ran
on the new boot, saw the Running entry from the previous boot, and
simply deleted it. Mergemaster polling `get_merge_status` then saw
"Merge job disappeared", treated it as a strike, and after three
restarts escalated the story to MergeFailureFinal — even though no
real merge failure ever happened (this is what trapped story 998
during the bug 1001 iteration cycle).

Reap now also fires a `WatcherEvent::WorkItem reassign` for the
cleared story so the auto-assign watcher loop re-runs
start_merge_agent_work on the fresh boot. The story is still in
4_merge/; the merge resumes automatically. The change is contained to
the reap path — start_merge_agent_work's own behaviour is unchanged.

Added regression test
reap_stale_merge_jobs_emits_reassign_watcher_event that asserts the
new event fires. Existing
reap_stale_merge_jobs_removes_old_running_entry_without_merge still
passes (the "without_merge" guarantee is about agent spawning, not
about absence of watcher events).

Also exposes AgentPool::watcher_tx() as pub(crate) so the merge
runner can fan out re-dispatch events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 21:28:10 +01:00
dave c3c9db3d8b huskies: merge 987 2026-05-13 16:30:31 +00:00
dave 430079ecbc huskies: merge 986 2026-05-13 16:01:51 +00:00
dave 91fbad568a huskies: merge 982 2026-05-13 15:34:41 +00:00
dave 4b18c01835 huskies: merge 973 2026-05-13 14:08:05 +00:00
dave e9a7468d8a huskies: merge 981 2026-05-13 14:01:02 +00:00
dave a34c9796b5 huskies: merge 913 2026-05-12 15:30:23 +00:00
dave 1d86202abb huskies: merge 868 2026-04-29 23:34:24 +00:00
dave 9a3f60d5d3 huskies: merge 866 2026-04-29 22:47:53 +00:00
dave dcd695ad0e huskies: merge 852 2026-04-29 08:55:49 +00:00
dave 89bf4ae0cf huskies: merge 831 2026-04-29 00:16:18 +00:00