huskies

Author	SHA1	Message	Date
dave	0b3a33a63c	huskies: merge 1037	2026-05-14 15:54:17 +00:00
Timmy	b0090aba84	Adding baseline source-map	2026-05-14 16:35:08 +01:00
dave	14a39b6205	huskies: merge 980	2026-05-13 14:44:17 +00:00
dave	c89a5c2da6	huskies: merge 966	2026-05-13 12:21:43 +00:00
dave	3c9851d17d	docs(AGENT.md): forceful "no exceptions" doc-comment rule Two stories today (961, 962) passed every other gate and got bounced at the merge step on a single missing `///` on a `pub mod` line. Sonnet keeps treating the doc comment as optional when the rule says "add doc comments to new modules and pub functions/structs/enums." Promote the rule to its own loud section with no-exceptions wording and a concrete reminder to run source-map-check before committing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 12:08:54 +00:00
Timmy	78b1ecdc3c	docs(AGENT): require PLAN.md update on every wip + final commit The "living document" rule was soft and got ignored — coders wrote PLAN.md once at session start and then drifted away from it. Tie the update to a trigger they already do (the wip/final commit), and call out stale "Current state" as a process failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 11:57:51 +01:00
dave	4a0fbcaa95	huskies: merge 949	2026-05-13 07:14:50 +00:00
dave	9ccbdff19f	huskies: merge 952	2026-05-13 05:43:22 +00:00
dave	8e9112066f	huskies: merge 935	2026-05-12 22:03:15 +00:00
dave	c3144b7937	huskies: merge 900	2026-05-12 16:46:33 +00:00
Timmy	6feb68f3e3	fix(923): watchdog counts only tool-using turns; narration-only turns no longer burn budget Observed: stories 917, 918, 920, 910 all turn-limit-killed despite producing real commits. Tally across their session logs shows 30–55% of assistant turns were pure narration ("I'll read X next", "Now let me check Y") with no tool_use. At 80 max_turns the effective work budget was ~44 tool calls, not enough for a typical bug fix's edit + test + check_criterion cycle. Changes: - New optional AgentConfig field max_tool_turns. When set the watchdog uses it instead of max_turns; only assistant messages whose data.message.content has at least one tool_use block count. - count_turns_in_log in agents/pool/auto_assign/watchdog/limits.rs filters on tool_use. Existing test helper write_fake_session_log now emits tool_use blocks; added write_fake_mixed_session_log for the narration regression test. - agents.toml: coders/coder-opus get max_turns=200 (claude-code's own --max-turns cap, sized to never bite before the watchdog) and max_tool_turns=80. qa: 120 / 40. mergemaster: 250 / 100. Budgets unchanged — the dollar cap remains the runaway-loop backstop, with ~$3-5 worst-case waste if an agent narrates indefinitely. - Two new regression tests: * watchdog_does_not_count_narration_only_turns: 5 tool + 30 narration under max_tool_turns=10 stays Running. * watchdog_max_tool_turns_overrides_max_turns: 4 tool turns at max_tool_turns=3 / max_turns=200 still terminates with TurnLimit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 17:25:11 +01:00
Timmy	bb845d17cf	docs(904): drop run_tests retry-on-timeout clause from coder prompts Bug 903 (run_tests attach instead of respawn) + 904 (MCP progress notifications + SSE) together eliminate the transport-timeout error mode from the agent's point of view: long test runs complete without the MCP client ever observing a tool-call error. Production verification (see `d64f1e94` / `ddc4228b` deploy at 14:30 UTC today) confirmed 78s and 65s test runs completing in single processes with no respawn churn and no retry needed. The "If run_tests errors with a transport timeout, call it again" sentence in coder-1/2/3/opus system_prompts (added belt-and-braces in `a97a10fb`) is now redundant. Removing it tightens the agent's mental model down to: call run_tests, wait for the result. No error-handling branch, no retry semantics to internalise. This closes the last open AC on story 904. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 15:36:53 +01:00
Timmy	a97a10fba2	docs(903): coder system_prompts — clarify run_tests retry contract Pre-d64f1e94 the "call run_tests again — it attaches" guidance was a lie (every call killed the prior job and spawned a fresh one). With the attach fix in place, the contract is now real and safe to depend on. Tighten the wording so agents see exactly what to do: OLD: "Do not use ScheduleWakeup to wait for run_tests; if run_tests appears to time out, call run_tests again — it attaches to the in-flight test job and blocks until completion." NEW: "If run_tests errors with a transport timeout, call it again — it's idempotent and attaches to the same in-flight test job, so retries are safe and eventually return a pass/fail result." Improvements: - "errors with a transport timeout" matches what the agent literally observes (a tool-call error), not the vague "appears to time out". - Explicit on idempotency so agents understand why retry is safe and don't worry about double-running the suite. - Drops the ScheduleWakeup clause — already enforced via the `disallowed_tools` setting on coder-1/2/3/opus, so the prompt reminder was redundant. Applied uniformly across coder-1, coder-2, coder-3, coder-opus. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 14:54:34 +01:00
Timmy	e955250474	fix(902): coder system_prompts steer to get_story_todos for story content Bug 902: the Step 0 "resume from worktree state" instruction told coders to call git_status / git_log / git_diff to discover prior session work, which they then extended into hunting for the story `.md` file on disk via find / ls — pointless post-865, since story content lives only in the CRDT. Update Step 0 in coder-1, coder-2, coder-3, and coder-opus to add an explicit instruction: "To read story content, ACs, or description, call the `get_story_todos` MCP tool — do NOT search for a story `.md` file on disk; story content is CRDT-only." Single substring replacement covers all four agents (identical Step 0 across them). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 13:13:08 +01:00
dave	fac4442969	fix(896): disallow ScheduleWakeup for coder agents; add run_tests retry guidance - Add `disallowed_tools` field to `AgentConfig` and render it as `--disallowedTools` CLI flag in `render_agent_args` - Set `disallowed_tools = ["ScheduleWakeup"]` on all four coder agents (coder-1, coder-2, coder-3, coder-opus); QA and mergemaster unaffected - Append instruction to all coder `system_prompt`s: do not use ScheduleWakeup to wait for run_tests; if run_tests appears to time out, call run_tests again — it attaches to the in-flight job and blocks - Add tests: `render_agent_args_disallowed_tools` and `coder_agents_disallow_schedule_wakeup`	2026-05-08 15:28:48 +01:00
dave	c50a04445c	spike(814): add gateway update command design doc Documents chat-driven `update` bot command for multi-project gateway: command surface, auth (room+role guard, future Ed25519), Docker-managed rollout sequence, automatic and manual rollback, open questions, and dependencies.	2026-04-29 18:17:19 +00:00
dave	cf35027b5a	config(coders): step 0 — resume prior-session work via git_status + git_log/diff against master..HEAD	2026-04-29 16:03:03 +00:00
dave	b4854cf693	huskies: merge 862	2026-04-29 13:28:37 +00:00
dave	9979ff2cf9	huskies: merge 859	2026-04-29 10:18:37 +00:00
dave	8802e1fe59	huskies: merge 853	2026-04-29 09:08:28 +00:00
dave	549a9defc4	huskies: merge 851	2026-04-29 08:42:28 +00:00
dave	3ce34c34e9	huskies: merge 850	2026-04-29 08:27:05 +00:00
dave	b698cee284	huskies: merge 821	2026-04-28 21:06:54 +00:00
dave	32a3465fc4	fix: tell the truth about run_tests being blocking `tool_run_tests` in `server/src/http/mcp/shell_tools/script.rs` is fully blocking server-side: it spawns the test child, polls every 1s server-side until exit (or `TEST_TIMEOUT_SECS = 1200s`), and returns the full {passed, exit_code, output} directly. There is NO async/started-status return path. But two places told agents the wrong story: 1. `tools_list/system_tools.rs` description claimed "Returns immediately with status: started. Poll get_test_result..." — agents read tool descriptions for protocol semantics, so they followed this and burned turns polling get_test_result. 2. `agents.toml` had been correctly saying it blocks, but my last commit (`776aad38`) "fixed" it the wrong way based on a misread of the code. Now both say: run_tests blocks server-side, returns the full result, do not poll get_test_result. get_test_result remains for external observers (UI checking on a job another caller started). Reverts the prompt change in `776aad38` with the correct text.	2026-04-28 15:59:06 +00:00
dave	776aad3877	fix: agent prompts honest about run_tests being async Pre-f958f57e, run_tests blocked until completion. After that fix it became a background-job starter, with get_test_result polling. The agent prompts were never updated, so they still said "run_tests blocks until complete" — and agents then waste turns polling. Updated coder-1/2/3, coder-opus, and qa prompts to describe the actual flow: run_tests is async, get_test_result blocks for up to 20s per call, test suites typically take 1-5 minutes so expect a few polls. Companion bug filed for bumping TEST_POLL_BLOCK_SECS so one poll covers most test runs (root-cause fix; this commit is the prompt half).	2026-04-28 15:55:15 +00:00
dave	bb779a0b0f	chore: regenerate STACK.md source map from module doc-comments Walked server/src/, frontend/src/, and crates/, extracting each module's //! doc-comment to build a directory-level source map. One row per directory + one row per top-level file. Replaces the hand-written stopgap from `5d6757dd` with content auto-derived from the codebase, so it stays useful as decomposes happen — the descriptions come from mod.rs, not from my recollection of where things live. Still a stopgap until 819 (auto-generated source-map-gen) lands and gets wired into the agent spawn path, but the content is closer to what 819 will produce.	2026-04-28 15:50:29 +00:00
dave	5d6757dd65	chore: restore source map in STACK.md (stopgap for 818 regression) 818 stripped the source map because it had stale paths. Empirically that made coder agents far slower — they spent most of each session re-discovering the codebase via Read/Grep before reaching any Edit, and ran out of turn budget without committing. Restoring a fresh source map keyed off current master. Uses directories where possible so it stays useful through future decomposes, plus a "Canonical examples" section pointing at the patterns to copy when adding new CRDT collections, RPC handlers, services, chat commands, etc. This is a stopgap until 819 (auto-generated source-map-gen) lands.	2026-04-28 15:43:44 +00:00
dave	36ca8d5e3b	huskies: merge 827	2026-04-28 13:01:48 +00:00
dave	e879d6f602	huskies: merge 818	2026-04-28 12:03:01 +00:00
dave	c1bb5888a8	config: bump mergemaster max_turns 60->100, budget $15->$25 Mergemaster needs more headroom for heavy merges (e.g. the slug-to-numeric ID migration touching many files, or the FS-shadow deletion stories that require fixing test setup across the codebase). 60 turns wasn't enough for the larger ones. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 21:20:22 +00:00
dave	191883fe2a	config: brutalist refactor guidance + bump mergemaster inactivity_timeout - Append to all coder/opus system_prompts: for delete/signature-change refactors, delete first and let compiler errors guide the call-site walk; do not pre-read files predicting breakage. Reduces exploration overhead on mechanical refactors. - Bump mergemaster inactivity_timeout_secs 300 -> 900 (15 min) so mergemaster survives the 5-minute API rate-limit backoff. Without this, mergemaster gets killed for inactivity while waiting on rate limit clear, blocking all merges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 21:19:09 +00:00
dave	2b28ccbf2c	Merge spike branch 'feature/story-679_spike_migrate_inter_component_http_to_signed_crdt_websocket_bus' into master	2026-04-27 17:01:48 +00:00
dave	5884dac825	chore: gitignore .huskies/session_store.json (runtime artifact) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 14:59:33 +00:00
dave	0b7f7dfdf7	config: bump sonnet coder-1/2/3 max_turns 50→80 Stories like the broadcaster-consumer migrations legitimately need ~60 substantive turns (16 ProjectConfig initializer sites + main.rs subscriber + reading existing patterns to mirror). 50 was too tight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 14:56:24 +00:00
dave	756c790b9f	spike 679: document HTTP-to-CRDT-bus migration plan Full inventory of all gateway and project server endpoints with caller, purpose, latency/freshness/durability requirements. Classifies each as write/read/external-webhook/frontend-asset. Maps write endpoints to target CRDT collections, proposes RPC frame shapes for read endpoints, drafts the unsigned read-RPC protocol (envelope, correlation IDs, TTL, error codes, peer-offline handling), lists in-memory state needing CRDT migration with proposed types, and defines a wave-ordered migration plan with explicit dependencies (story 665 Ed25519 auth as the blocker for write migrations).	2026-04-27 14:49:38 +00:00
dave	56c979c950	config: tell mergemaster to use 5-min sleeps between merge_agent_work polls Real cause of mergemaster turn-burnout: not merge conflicts, just polling overhead. The server-side tool_merge_agent_work IS designed to block until the merge completes, but the MCP client times out after 60s. The agent then polls get_merge_status, with 30-60s sleeps between polls — each poll cycle costs 2 turns (sleep + tool call). The merge takes 5-10 min for a clean run, so the agent burns 10-20 turns just waiting. Updated workflow tells mergemaster: - 'operation timed out' is normal, do NOT immediately re-call (would queue a duplicate merge) - Use Bash sleep 300 (one 5-min wait = 1 turn) between polls - Cap at 3 polls = 15 minutes total, plenty for any clean merge - Reserve turns for actual fix-up work if gates fail Combined with the earlier 30→60 turn / $5→$15 budget bump, this should land any merge with no real conflicts in 3-5 turns total. Plenty of headroom remaining for genuine gate-fix work.	2026-04-27 10:50:44 +00:00
dave	7b305ba892	config: bump mergemaster max_turns 30→60, budget $5→$15 30 turns is too tight for non-trivial merge gate failures. Combined with the 3-retry cap, stories with any post-merge fix-up needed (cargo fmt nits, slightly out-of-date diffs after parallel merges, etc.) get permanently blocked. This is a stopgap until story 668 lands (which will keep gates_passed=false work in the coder stage entirely, so mergemaster only ever sees clean diffs and the original 30 turns / $5 is fine again).	2026-04-27 10:41:45 +00:00
dave	9fbbfcd585	huskies: merge 667_story_agent_prompt_target_maximum_file_size_of_800_lines_as_a_soft_guide_decompose_larger_files_by_concern	2026-04-27 01:37:52 +00:00
dave	f88bb5f486	huskies: merge 645_bug_agent_runtime_panics_with_output_write_bytes_is_ok_assertion_marking_stories_falsely_blocked	2026-04-26 10:54:58 +00:00
dave	2097787e1f	docs: add pipeline state machine reference (current + planned transitions) Captures the dual representation we have today (legacy filesystem stage strings + front-matter flags vs the typed Stage/ArchiveReason/ExecutionState enums in pipeline_state.rs that are defined-but-not-wired) and itemises the transitions and behaviours we have identified as missing or partially implemented (first-class supersede/abandon/hold verbs, type-conversion side effects, pinned-agent honouring under contention, blocked-flag enforcement beyond auto-assign, ghost-story recovery, etc.). Section (b) is intended as a living dumping ground — append new transitions and incidents as they come up so that the state-machine roadmap (spike 613 in backlog) has a ready-made input.	2026-04-25 13:33:57 +00:00
dave	4b765bbc39	huskies: merge 601_story_project_local_agent_prompt_layer_for_huskies	2026-04-23 11:56:19 +00:00
dave	b3da321a3b	huskies: merge 598_story_expose_huskies_init_as_a_gateway_mcp_tool	2026-04-22 21:39:29 +00:00
Timmy	6c76b569c4	Deleting ancient handoff file	2026-04-17 13:18:02 +01:00
dave	a4480fa067	chore: feed CONTEXT and STACK specs to all agents, update STACK with source map Agents now read specs/00_CONTEXT.md (what the project does) and specs/tech/STACK.md (tech stack + source map) in addition to the README. STACK.md rewritten to reflect current state — removes stale references to biome, tauri-specta, .story_kit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 18:15:09 +00:00
dave	483489cc44	fix: rewrite coder agent prompts — run tests before commit, remove stale instructions Key changes: - Tests before commit, not after: "run run_tests, fix failures, then commit" - Removed polling references (run_tests blocks now) - Removed "never run script/test" (primes agents to think about it) - Removed dead "user review" instruction - Removed "commit and stop" which signalled skip-testing - Cleaner workflow: implement → check criteria → test → fix → commit → exit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 13:19:08 +00:00
dave	8936abd8cd	docs: add project architecture section to README for agent context Agents need to know the gateway is a mode of the binary, not a separate app, and that UI stories are frontend React work, not Rust backend restructuring. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:23:18 +00:00
dave	28adef9739	chore: switch mergemaster to opus and add cargo fmt guidance Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:35:57 +00:00
dave	badfabcf5e	chore: switch mergemaster to opus and add cargo fmt guidance Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:27:58 +00:00
dave	845b85e7a7	fix: add --all to cargo fmt in script/test and autoformat codebase cargo fmt without --all fails with "Failed to find targets" in workspace repos. This was blocking every story's gates. Also ran cargo fmt --all to fix all existing formatting issues. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 14:07:08 +00:00
dave	0cb68e1de9	docs: add deployment modes to README — standard, headless, and gateway Documents the three modes of the huskies binary: standard single-project server, headless build agent (--rendezvous), and multi-project gateway (--gateway). Includes projects.toml config example and Docker Compose sketch for multi-project setup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 13:44:10 +00:00

194 Commits