Files
huskies/server/src/agents/pool/pipeline/merge
Timmy 2758f744f2 fix: reap_stale_merge_jobs re-dispatches instead of just deleting
A mid-merge server restart used to silently kill the merge: the
in-flight tokio task died with the process, reap_stale_merge_jobs ran
on the new boot, saw the Running entry from the previous boot, and
simply deleted it. Mergemaster polling `get_merge_status` then saw
"Merge job disappeared", treated it as a strike, and after three
restarts escalated the story to MergeFailureFinal — even though no
real merge failure ever happened (this is what trapped story 998
during the bug 1001 iteration cycle).

Reap now also fires a `WatcherEvent::WorkItem reassign` for the
cleared story so the auto-assign watcher loop re-runs
start_merge_agent_work on the fresh boot. The story is still in
4_merge/; the merge resumes automatically. The change is contained to
the reap path — start_merge_agent_work's own behaviour is unchanged.

Added regression test
reap_stale_merge_jobs_emits_reassign_watcher_event that asserts the
new event fires. Existing
reap_stale_merge_jobs_removes_old_running_entry_without_merge still
passes (the "without_merge" guarantee is about agent spawning, not
about absence of watcher events).

Also exposes AgentPool::watcher_tx() as pub(crate) so the merge
runner can fan out re-dispatch events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 21:28:10 +01:00
..
2026-04-29 00:16:18 +00:00
2026-04-29 00:16:18 +00:00
2026-05-13 16:30:31 +00:00
2026-04-29 00:16:18 +00:00