Lost in commit db00a5d4 when extracting tests from main.rs into cli.rs;
the line range used for the panics_on_duplicate_agent_names test in main.rs
started at the fn signature instead of the attribute line.
The 1258-line main.rs is split into:
- main.rs: mod declarations, async fn main + panics_on_duplicate_agent_names test (894 lines)
- cli.rs: CliArgs struct, parse_cli_args, print_help, resolve_path_arg + their tests (372 lines)
main.rs cannot itself become a directory (binary crate must have main.rs at the
crate root); cli.rs is a sibling module.
No behaviour change. All cli tests pass; full suite green.
In gateway mode the bot has no local CRDT or project filesystem, so all
bot commands (status, backlog, start, assign, etc.) returned empty or
broken results. Now the gateway bot proxies non-local commands via HTTP
to the active project's /api/bot/command endpoint, which already exists
on every project server.
Only a small set of gateway-local commands (help, ambient, reset, switch)
are still handled directly by the gateway. Everything else is forwarded
automatically, so new commands added in the future will work through the
proxy without additional gateway changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cargo fmt without --all fails with "Failed to find targets" in
workspace repos. This was blocking every story's gates. Also ran
cargo fmt --all to fix all existing formatting issues.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Startup now logs "huskies v0.10.0 (build abc1234)" so we can verify
both the version and the commit that's running. build_hash is a
runtime artifact, not tracked in git.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.
Also fixes agent permission mode: acceptEdits replaces invalid
allowFullAutoEdit that was causing agents to crash-loop on spawn.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Writes HEAD short hash to .huskies/build_hash after successful cargo
build. Logs it on startup as [startup] Running build: <hash>. No more
guessing whether the rebuild actually deployed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The sync_crdt_stages_from_db migration reads pipeline_items (which has
stale 5_done stages) and overwrites the CRDT back to 5_done for stories
that were already swept to 6_archived. On every restart, done stories
reappear and get re-swept.
The migration served its purpose — CRDT stages are now correct. Remove it.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
510 stories had stale 1_backlog stages in the CRDT because they were
imported during the filesystem→CRDT migration and then moved forward
via filesystem-only moves that never wrote CRDT ops. This made done
stories appear as ghost entries in the backlog.
On startup, reads the authoritative stage from pipeline_items and
corrects any CRDT entries that disagree.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
read_all_items was iterating all CRDT entries including stale duplicates
from earlier stage writes. A story written multiple times (backlog →
current → done) would appear in the output multiple times with different
stages, causing ghost entries in the pipeline status and backlog views.
Now iterates only the index (story_id → visible_index map) which
represents the latest-wins deduplicated view of each story.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bundles in-progress work from a parallel Claude session toward fixing
bug 501 (rate-limit retry timer doesn't cancel on stop_agent / move_story
/ successful completion). This commit lands the foundation but the MCP
tool wiring is still TODO.
- server/src/chat/timer.rs: defense-in-depth check in tick_once that
skips firing a timer for stories already past 3_qa (3_qa, 4_merge,
5_done, 6_archived). The primary cancellation path will be in the
MCP tools; this guards races where a timer was scheduled before the
story was advanced and the tool didn't get a chance to cancel it.
- server/src/http/context.rs: adds `timer_store: Arc<TimerStore>` field
on AppContext so MCP tools (move_story, stop_agent, ...) can reach
the shared timer store and cancel pending entries when the user
intervenes manually. The test helper is updated to construct one.
- server/src/main.rs: wires up a TimerStore instance in the AppContext
initialiser so the binary actually compiles after the context.rs
field addition. TODO: the matrix bot's spawn_bot still creates its
own TimerStore instance (in chat/transport/matrix/bot/run.rs:220-227)
rather than consuming the shared one — that refactor is the next
step in the bug 501 fix.
What is NOT in this commit and is needed to actually fix bug 501:
- The MCP tool side (move_story, stop_agent, delete_story) does not
yet call timer_store.cancel(story_id) when invoked
- The matrix bot's spawn_bot does not yet consume the shared
timer_store from AppContext — it still creates its own
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Manual squash-merge of feature/story-478_… into master after the in-pipeline
mergemaster runs failed silently. The 478 agent did substantial real work
across multiple respawn cycles before being interrupted; commits on the
feature branch were intact and verified high-quality but never merged via
the normal pipeline path due to compounding bugs:
- The first mergemaster attempt ran ($0.82 in tokens) and exited "Done"
cleanly but didn't push anything to master — likely the worktree was
briefly on master rather than the feature branch when the merge_agent_work
MCP tool ran, so it found nothing to merge.
- Subsequent timer fires defaulted to spawning coders instead of resuming
mergemaster, burning more tokens for no progress.
- Bug 510 (split-brain shadows yanking done stories back to current) and
bug 501 (timers don't cancel on stop/completion) compounded the cost.
What this commit lands:
- server/src/crdt_sync.rs (new, ~518 lines): GET /crdt-sync WebSocket
handler that subscribes to locally-applied SignedOps and streams them as
binary frames. Per-peer bounded queue (256 ops) drops slow peers.
- server/src/crdt_state.rs: new public functions subscribe_ops(),
all_ops_json(), apply_remote_op() backing the sync handler. Adds the
CRDT_OP_TX broadcast channel (capacity 1024).
- server/src/main.rs: wires up the sync subsystem at startup.
- server/src/http/mod.rs: registers the new endpoint.
- server/src/config.rs: adds optional rendezvous field for outbound peers.
- server/src/worktree.rs: minor changes from the original branch.
- server/Cargo.toml: cfg lint suppression for CrdtNode derive.
- crates/bft-json-crdt/src/debug.rs: fix unused-variable warnings.
Resolved a trivial test-mod merge conflict in crdt_state.rs (both 478 and
503 added new tests at the end of the test module — kept both sets).
Note: this is the squash of the original 478 work that the user explicitly
authorized landing. The earlier rogue commit ac9f3ecf — which added a
DIFFERENT, broken implementation of the same feature directly to master
under the user's identity without consent — was reverted earlier in this
session. The forensic tags rogue-commit-2026-04-09-ac9f3ecf and
pre-502-reset-2026-04-09 still exist for incident audit.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The 490 merge introduced references to a db::crdt module that doesn't
exist yet (it's part of story 491). Commented out with TODO(491)
markers so master compiles. The crdt_state.rs module from 490 is
intact — these are just the call sites that will be wired up when
491 lands.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Marked #[ignore] so cargo test skips it by default. Run manually with
--ignored flag when needed for benchmarking.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CRDT state layer backed by SQLite for pipeline state. Integrates the
BFT JSON CRDT crate with SQLite persistence via sqlx. Ops are persisted
and replayed on startup. Node identity via Ed25519 keypair.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>