huskies

Author	SHA1	Message	Date
dave	a7bad217eb	huskies: merge 1137 story First-run project init flow — walk through config instead of leaving defaults silently	2026-05-18 12:59:11 +00:00
dave	f2c13c7d29	huskies: merge 1136 story Sled → gateway WebSocket back-channel so project pipeline events reach Timmy	2026-05-18 12:29:50 +00:00
dave	3444ff4e29	huskies: merge 1135 story Bootstrap Claude credentials into newly-launched project sleds	2026-05-18 12:06:32 +00:00
dave	26f4da7ba5	huskies: merge 1134 story mkdir -p ~/.huskies/<name>/ before ssh-keygen in adopt	2026-05-18 11:53:31 +00:00
Timmy	4c6b4f5d4d	fix: project sleds need claude CLI + extensions.worktreeConfig Two issues that surfaced when story 1 ran in the adopted huskies-server sled: 1. Dockerfile.base: the base image had no nodejs / claude CLI, so every coder agent spawn in an adopted project sled failed with `Unable to spawn claude: No viable candidates found in PATH`. Install nodejs + @anthropic-ai/claude-code in the base image so every sled built from it can spawn agents out of the box. 2. worktree/create.rs::install_pre_commit_hook: `git config --worktree` requires `extensions.worktreeConfig = true` to be set on the repo config; without it, every worktree creation logged a noisy `Pre-commit hook install failed` warning. Enable the extension idempotently before the per-worktree hooks-path set so the hook install succeeds cleanly. After this, rebuild huskies-project-base and recreate any adopted project containers to pick up the CLI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 08:40:21 +01:00
dave	70797753df	huskies: merge 1132 story Chat-bot proxy reads stale `gateway_project_urls` BTreeMap instead of live store (1122 missed this seam)	2026-05-18 00:02:37 +00:00
Timmy	ec3216072d	Revert "fix: bind project container host ports to 0.0.0.0" This reverts commit `810c8d4d72`.	2026-05-18 00:28:34 +01:00
Timmy	810c8d4d72	fix: bind project container host ports to 0.0.0.0 Story 1130 added HUSKIES_HOST=0.0.0.0 so the server INSIDE a project container binds to all interfaces, but the host-side `docker -p` mapping was still `127.0.0.1:{port}:3001` and `127.0.0.1:{ssh_port}:22` — reachable from the docker host only, blocking remote MCP clients and out-of-host SSH onto the project container. Switch host-side mapping to 0.0.0.0 for both the MCP and SSH ports so project containers spawned via `new project` are reachable from anywhere that can route to the docker host. Existing containers created before this commit retain their localhost-only mapping and need to be recreated to pick up the change. Add a regression test asserting both -p arguments use 0.0.0.0 and reject any 127.0.0.1 restriction in the mapping. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-18 00:04:32 +01:00
Timmy	42e6eec9e9	Bump version to 0.12.1	2026-05-17 23:46:50 +01:00
dave	fe00fe6a25	huskies: merge 1127 story Migrate all LLM-invoking transports onto assemble_prompt_context; delete legacy Vec	2026-05-17 22:28:01 +00:00
dave	2d0387fe63	huskies: merge 1126 story Gateway event aggregator with per-session scope filters (Timmy=All, Sally=single sled)	2026-05-17 21:02:08 +00:00
dave	d86cc38b2a	huskies: merge 1128 story Bounded event queues + EventStreamGap sentinel + observability for context assembly	2026-05-17 20:30:02 +00:00
dave	badd522d60	huskies: merge 1125 story LLM session entity + assemble_prompt_context helper, wired into Matrix bot	2026-05-17 20:09:33 +00:00
dave	ecd3f600d9	huskies: merge 1130 story Adopted/launched project containers bind huskies to 127.0.0.1, unreachable from host MCP	2026-05-17 20:02:22 +00:00
dave	89058ebd49	huskies: merge 1124 story Persist TransitionFired into a per-sled CRDT event log	2026-05-17 19:37:50 +00:00
dave	d8204ab7ed	huskies: merge 1129 story find_free_port fallback returns unbindable port silently when range is exhausted	2026-05-17 19:24:29 +00:00
dave	c1b7e12b0b	huskies: merge 1122 story Chat-bot switch command reads stale `gateway_projects` Vec instead of live `gateway_projects_store`	2026-05-17 18:49:58 +00:00
dave	6331dea8b0	huskies: merge 1121 story Remove the marketing website from the huskies OSS repo (now lives in huskies-server)	2026-05-17 18:43:43 +00:00
dave	7de167b21b	huskies: merge 1116 story rebuild_and_restart loses pending CRDT ops by calling exec() before persistence channel drains	2026-05-17 17:48:44 +00:00
dave	73cf1c6ff9	huskies: merge 1117 story MCP tool for adopt: expose `new project --adopt` as an MCP call	2026-05-17 16:42:06 +00:00
dave	f8b1e14b74	huskies: merge 1118 story Automate per-project docker image builds (huskies-project-base + per-stack overlays)	2026-05-17 16:30:08 +00:00
Timmy	265e6f9a15	fix(1101): strip passing-test lines before classify() lint check; remove diagnostic The merge gate classifier was matching trigger keywords like `missing_doc_comments` inside passing-test name lines (e.g. `test agents::gates::tests::classify_lint_from_missing_doc_comments ... ok`), causing every gate failure to be mis-classified as Lint and bounced back to a fixup coder. Strip `test … … ok` lines before scanning for lint triggers. Also removes the temporary diagnostic block in runner.rs that confirmed the bug. Applied directly to master because the 1101 feature branch carried stale work from an earlier incarnation of the story that semantically conflicted with master's later diagnostic commit (`is_fixup` deleted on the branch, referenced on master). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 16:52:26 +01:00
dave	0695ad7ae6	huskies: merge 1115 story new project: --adopt flow to wrap a container around an existing checkout	2026-05-17 15:17:12 +00:00
dave	eb6b07531a	huskies: merge 1114 story new project: --path flag to override default host directory	2026-05-17 14:48:49 +00:00
Timmy	a5bfd40233	Bump version to 0.12.0	2026-05-17 02:10:31 +01:00
dave	a40500eea9	huskies: merge 1111 bug Test isolation: `init_for_test()` and `ensure_content_store()` are once-per-thread, not once-per-test, polluting CRDT state across tests	2026-05-17 00:33:45 +00:00
dave	f8212f102f	huskies: merge 1109 story Chat bootstrap Phase 4: `--git` clones an existing repo and configures push credentials	2026-05-17 00:18:25 +00:00
dave	59302b465d	huskies: merge 1108 story Chat bootstrap Phase 3: SSH-remote editor access into the project container (any editor)	2026-05-16 23:37:59 +00:00
dave	efafe44db1	huskies: merge 1110 story Chat bootstrap Phase 2b: additional stack overlays (Go, Python, Ruby, JVM)	2026-05-16 23:20:31 +00:00
dave	3a43337735	huskies: merge 1107 story Chat bootstrap Phase 2a: stack-overlay framework + Rust and Node stack overlays	2026-05-16 23:01:49 +00:00
dave	10d992a7e4	huskies: merge 1106 story Chat bootstrap Phase 1: `new project` chat command spawns a bare project container and registers it with the gateway	2026-05-16 22:39:20 +00:00
Timmy	7db0b78e88	Bump version to 0.11.1	2026-05-15 23:38:09 +01:00
dave	979492449e	huskies: merge 1105 bug Freeze from Backlog stores wrong resume_to — Unfreeze restores to Coding instead of Backlog	2026-05-15 22:33:54 +00:00
Timmy	6fbe239313	fix(1102): require non-empty origin.id on create_* MCP tools bug 1102 was created today with origin={kind:user, id:""} because build_origin silently defaulted id to empty when the caller didn't pass one — we couldn't tell who filed it. Bug 1088's origin field is useless as audit if every caller can omit themselves. Changes: - build_origin (server/src/http/mcp/story_tools/mod.rs) now returns Result<String, String> and rejects missing/empty/whitespace-only id with an instructional error pointing at bug 1102 / story 1104. - 5 create_* tool handlers (bug, spike, refactor, epic, story) now resolve origin BEFORE create_*_file so an attribution-less call leaves no half-state behind. - 5 tool input schemas now advertise origin as a required object via a shared origin_schema() helper. The schema description gives every caller (coder agent, chat bot, user, system) a concrete example so the LLM populates the field correctly on first sight. - Test fixtures pass origin = {kind:"test", id:"test-suite"}. Story 1104 (signed actions) is the longer-term replacement; this is the quick attribution win agreed for master ahead of that design work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 23:13:54 +01:00
Timmy	26527e7dae	diag(1101): log classify verdict + matched trigger on merge gate failures Bug 1101's reframed AC1: when a non-success merge runs, log the typed GateFailureKind, the matched classifier-trigger substring (if any) and ~90 chars of surrounding context. Fires on every gate failure regardless of routing, so the next fixup-loop bounce will tell us which substring is fooling classify() into Fmt|Lint|SourceMapCheck on what's actually a Test failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 23:13:38 +01:00
dave	04a57e92c2	huskies: merge 1103 bug Rate-limit warning at session start sticks the `rate_limit_exit` flag, causing 1053's fast-path bypass to skip completion on clean session exits	2026-05-15 21:02:37 +00:00
dave	4216ced493	huskies: merge 1100 bug Multiple LLM agents can run concurrently on the same story (coder + mergemaster + others) — enforce one-agent-per-story invariant	2026-05-15 20:24:31 +00:00
dave	63d86f1263	huskies: merge 1096 bug Shadow drift: set_agent writes CRDT agent register without updating pipeline_items.agent	2026-05-15 19:05:56 +00:00
dave	1adc734801	huskies: merge 1098 bug Shadow drift: set_retry_count / bump_retry_count write CRDT register without updating pipeline_items.retry_count	2026-05-15 18:25:25 +00:00
dave	8531bac6cd	huskies: merge 1097 bug Shadow drift: set_depends_on writes CRDT depends_on register without updating pipeline_items.depends_on	2026-05-15 12:40:17 +00:00
dave	2857c3b46b	huskies: merge 1094 bug delete_story leaks zombie rows in pipeline_items shadow table — 176 tombstoned items still report non-terminal stages	2026-05-15 12:27:48 +00:00
dave	62d1535e76	huskies: merge 1095 bug Shadow drift: set_name writes CRDT name register without updating pipeline_items.name	2026-05-15 12:10:11 +00:00
dave	fc5481dbe4	huskies: merge 1093 bug Chat dispatcher spawns one Timmy per inbound message — needs coalesce window + per-session serial lock	2026-05-15 12:03:09 +00:00
dave	01e60a670c	huskies: merge 1091 refactor Migrate the merge-gate's stale-cargo kill path to `process_kill`	2026-05-15 11:50:03 +00:00
dave	c4010854a5	huskies: merge 1089 bug Stuck-agent detector blocks stories on legitimate exploration / debugging — uses too narrow a "progress" signal	2026-05-15 11:40:44 +00:00
dave	4aa76ce673	huskies: merge 1090 refactor Migrate `AgentPool::kill_all_children` and `kill_child_for_key` to `process_kill` so server shutdown and `stop_agent` actually kill claude	2026-05-15 11:16:16 +00:00
Timmy	fb82bd7bca	test(tick_loop): de-flake reconcile_never_floods_broadcast_channel The test asserted msg_count == 0 on a process-global broadcast channel (TRANSITION_TX is a single OnceLock<Sender> shared across the test binary), so any concurrent test calling apply_transition could land events in our receiver between the drain and the post-reconcile check. Observed failure: 3 stray transitions from parallel tests. Drop the strict count check. The real "never floods" invariant is captured by the Lagged check alone: 1000 seeded items must not overflow the 256-slot channel, which can only hold if the reconcile path bypasses the broadcast (AC4). The sibling test `reconcile_pass_scales_to_1000_items_without_lagged_divergence` already uses this Lagged-only pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 11:13:31 +01:00
Timmy	b7df5cbe4e	fix(agents): kill-then-status reorder in stop_agent stop_agent had the same order-of-operations bug fixed in the watchdog: status flipped to Failed before the claude process was verified gone, opening the idempotency window that allowed a duplicate spawn to race in alongside the surviving process. Now follows the three-step protocol: 1. Read worktree path under a read-only lock (no mutation). 2. SIGKILL the worktree's process tree via process_kill and block until verified gone — start_agent's Running/Pending whitelist continues to reject duplicate spawns throughout. 3. Only then mutate the agent record, abort the task handle, and drop the child_killers entry. Falls back to the old portable_pty SIGHUP path (with a warning) when no worktree was recorded, matching the watchdog's behaviour. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 10:46:02 +01:00
Timmy	fe9804b32c	feat: add process_kill module + use it to fix watchdog double-spawn Adds `crate::process_kill` — reliable SIGKILL-with-verify primitives used across the server in place of the various ad-hoc kill paths that ignored their kill-effective return values. The module exposes three pieces: - `sigkill_pids_and_verify(pids)`: SIGKILL each pid and block (up to 2s) until every pid is verified gone. Returns survivors if not. - `pids_matching(pattern)`: pgrep -f wrapper. - `descendant_pids(root)`: recursive pgrep -P walker for process trees. Wires the watchdog's limit-termination path through it, and reorders the protocol to fix the duplicate-coder bug observed on story 1086 (2026-05-15): Before: check_agent_limits set status=Failed before the kill ran. The kill itself was `portable_pty::ChildKiller::kill()`, which sends SIGHUP on Unix — claude-code ignores SIGHUP, so the process kept running while the agent record was already marked terminated. The idempotency check in `start_agent` whitelists Running/Pending, so the next auto-assign pass spawned a fresh agent alongside the still-alive prior one. Two claude PIDs sharing one session_id, racing on the same worktree. After: status update is moved OUT of check_agent_limits and into the caller AFTER the kill is verified. The kill itself is now SIGKILL-the- process-tree-in-the-worktree, with explicit verification that every pid is gone. The idempotency window is closed. The existing watchdog test suite (14 tests) still passes; 7 new tests cover the process_kill primitives directly. `agents/pool/process.rs`'s `kill_all_children` and `kill_child_for_key` still use the old portable_pty SIGHUP path — they have the same bug but in lower-impact code paths (shutdown, operator stop). They will be migrated under a separate story to keep this commit focused. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 10:36:33 +01:00
dave	df32a1542b	huskies: merge 1087 story Pipeline+Status split — Step D: migrate CRDT storage to (Pipeline, Status) and remove the Stage enum	2026-05-15 08:47:38 +00:00
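
The fix described in commit `265e6f9a15` (strip `test … ok` lines before scanning gate output for lint triggers) can be illustrated with a minimal sketch. This is not the repository's code: the function names and the trigger list below are hypothetical, and only the `missing_doc_comments` keyword and the shape of the passing-test line come from the commit message.

```rust
/// Returns true when a line looks like a passing-test report from the Rust
/// test harness, e.g. `test foo::bar ... ok`.
fn is_passing_test_line(line: &str) -> bool {
    let trimmed = line.trim();
    trimmed.starts_with("test ") && trimmed.ends_with(" ... ok")
}

/// Drops passing-test lines so that test *names* cannot trip keyword matching.
/// (Hypothetical helper; the real implementation may differ.)
fn strip_passing_test_lines(output: &str) -> String {
    output
        .lines()
        .filter(|line| !is_passing_test_line(line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Hypothetical trigger list; only `missing_doc_comments` is taken from the
/// commit message.
const LINT_TRIGGERS: &[&str] = &["missing_doc_comments"];

/// Returns the first lint trigger found in the output after passing-test
/// lines have been removed, or `None` if no trigger remains.
fn matched_lint_trigger(output: &str) -> Option<&'static str> {
    let cleaned = strip_passing_test_lines(output);
    LINT_TRIGGERS
        .iter()
        .copied()
        .find(|trigger| cleaned.contains(*trigger))
}
```

With this filter in place, a line such as `test agents::gates::tests::classify_lint_from_missing_doc_comments ... ok` no longer matches, while the same keyword appearing in a non-test line still does.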


1069 Commits