Commit Graph

9 Commits

Author SHA1 Message Date
Timmy 8b2ba1c810 fix: post-squash compile errors reclassify as semantic merge conflicts
When deterministic-merge produces a clean git squash but the post-squash
compile fails (typical when master gained a Stage payload field after the
feature branch forked — e.g. story 1018 hit `error[E0063]: missing field
plan` after 1010's PlanState landed), the failure is morally a merge
conflict that git's diff3 missed: the conflicting literal lives in a
different file from the type definition that changed on master. Routing
it as GatesFailed left mergemaster idle and the story stuck.

Changes:
- gates.rs GateFailureKind::classify: detect rustc compile errors
  (`error[E\d+]`) as Build instead of falling through to Test. Clippy
  errors (`error[clippy::...]`) still classify as Lint.
- agents/merge/mod.rs: new MergeResult::to_merge_failure_kind() method.
  GateFailure with failure_kind=Build maps to ConflictDetected (so the
  existing 998 subscriber auto-spawns mergemaster). Other gate failures
  stay GatesFailed.
- agents/pool/pipeline/merge/runner.rs: replace the inline match with a
  call to the new method.

Tests: 6 new unit tests covering the classifier branch and every
to_merge_failure_kind arm. All 2932 tests pass.
2026-05-14 10:18:33 +01:00
dave c3c9db3d8b huskies: merge 987 2026-05-13 16:30:31 +00:00
dave 430079ecbc huskies: merge 986 2026-05-13 16:01:51 +00:00
dave 2a77f73ba4 fix(merge): use server-start-time, not pid, for stale-merge detection
The merge_jobs cleanup encoded the server's pid in the CRDT and checked
`kill(pid, 0)` to decide whether a "running" entry was stale. Two problems:

  1. The cleanup runs *inside* the server, so checking whether the
     server's own pid is alive is tautological — kill(self_pid, 0)
     always succeeds.
  2. `rebuild_and_restart` does an `execve()` re-exec, which keeps the
     same pid. After re-exec, merge_jobs from the previous server
     instance still encode "the current pid" — so the cleanup never
     fires, and stories like 799/800 sit forever with status="running"
     while no actual merge runs.

Switch to a per-process server-start-time captured lazily in a
`OnceLock<f64>` (reset by execve, so the new instance sees a fresh
boot-time). A merge_job's recorded start-time < current boot-time means
it came from a previous instance: stale, delete it.

Legacy pid-encoded entries decode to None and are also treated as stale.

MergeJob.pid → MergeJob.server_start_time. Tests updated.
2026-04-28 20:41:32 +00:00
dave 7faacb6664 huskies: merge 773 2026-04-28 10:24:04 +00:00
dave 7ee542dd1e huskies: merge 757 2026-04-27 23:36:56 +00:00
dave 101f616346 huskies: merge 719_refactor_stale_merge_job_lock_recovery_on_new_merge_attempts 2026-04-27 17:46:49 +00:00
dave b340aa97b0 fix: clean up clippy warnings + cargo fmt across post-refactor surface
The 13-file refactor pass (commits db00a5d4 through eca15b4e) introduced
~89 clippy errors and 38 cargo fmt issues — every agent in every worktree
hit them on script/test, burning their turn budget on cleanup before doing
real story work. This is the silent kill behind 644, 652, 655, 664, 667
all hitting watchdog limits this round.

Changes:
- cargo fmt --all across 37 files (formatting normalisation only)
- #![allow(unused_imports, dead_code)] on 24 split modules where the
  python-script splitter imported liberally to be safe; tighter cleanup
  per-import will happen as agents touch each module
- Removed truly-dead re-exports (cleanup_merge_workspace, slog_warn from
  http/mcp/mod.rs, CliArgs/print_help from main.rs)
- Prefixed _auth_msg in crdt_sync/server.rs (handshake helper return is
  bound but not consumed)
- Converted dangling /// doc block in crdt_sync/mod.rs to //! so it
  attaches to the module
- Removed empty lines after doc comments in 4 spots (clippy lint)

All 2636 tests pass; clippy --all-targets -- -D warnings clean.
2026-04-27 01:32:08 +00:00
dave ce94dd0af4 refactor: split agents/merge.rs into mod.rs + squash.rs + conflicts.rs
The 1772-line merge.rs is split into:

- conflicts.rs: try_resolve_conflicts + resolve_simple_conflicts + tests (351 lines)
- squash.rs: run_squash_merge orchestrator + cleanup + run_merge_quality_gates + tests (1306 lines)
- mod.rs: doc, types (MergeJobStatus, MergeJob, MergeReport, SquashMergeResult), re-exports (52 lines)

Tests stay co-located.

No behaviour change. All 20 merge tests pass; full suite green
(2635 tests with --test-threads=1).
2026-04-26 21:15:06 +00:00