Commit Graph

27 Commits

Author SHA1 Message Date
dave cf35027b5a config(coders): step 0 — resume prior-session work via git_status + git_log/diff against master..HEAD 2026-04-29 16:03:03 +00:00
dave b4854cf693 huskies: merge 862 2026-04-29 13:28:37 +00:00
dave 9979ff2cf9 huskies: merge 859 2026-04-29 10:18:37 +00:00
dave 8802e1fe59 huskies: merge 853 2026-04-29 09:08:28 +00:00
dave 549a9defc4 huskies: merge 851 2026-04-29 08:42:28 +00:00
dave 3ce34c34e9 huskies: merge 850 2026-04-29 08:27:05 +00:00
dave b698cee284 huskies: merge 821 2026-04-28 21:06:54 +00:00
dave 32a3465fc4 fix: tell the truth about run_tests being blocking
`tool_run_tests` in `server/src/http/mcp/shell_tools/script.rs` is fully
blocking server-side: it spawns the test child, polls every 1s server-side
until exit (or `TEST_TIMEOUT_SECS = 1200s`), and returns the full
{passed, exit_code, output} directly. There is NO async/started-status
return path.

But two places told agents the wrong story:
1. `tools_list/system_tools.rs` description claimed "Returns immediately
   with status: started. Poll get_test_result..." — agents read tool
   descriptions for protocol semantics, so they followed this and burned
   turns polling get_test_result.
2. `agents.toml` had been correctly saying it blocks, but my last commit
   (776aad38) "fixed" it the wrong way based on a misread of the code.

Now both say: run_tests blocks server-side, returns the full result, do
not poll get_test_result. get_test_result remains for external observers
(UI checking on a job another caller started).

Reverts the prompt change in 776aad38 with the correct text.
2026-04-28 15:59:06 +00:00
dave 776aad3877 fix: agent prompts honest about run_tests being async
Pre-f958f57e, run_tests blocked until completion. After that fix it became
a background-job starter, with get_test_result polling. The agent prompts
were never updated, so they still said "run_tests blocks until complete"
— and agents then waste turns polling.

Updated coder-1/2/3, coder-opus, and qa prompts to describe the actual
flow: run_tests is async, get_test_result blocks for up to 20s per call,
test suites typically take 1-5 minutes so expect a few polls.

Companion bug filed for bumping TEST_POLL_BLOCK_SECS so one poll covers
most test runs (root-cause fix; this commit is the prompt half).
2026-04-28 15:55:15 +00:00
dave 36ca8d5e3b huskies: merge 827 2026-04-28 13:01:48 +00:00
dave c1bb5888a8 config: bump mergemaster max_turns 60->100, budget $15->$25
Mergemaster needs more headroom for heavy merges (e.g. the slug-to-numeric
ID migration touching many files, or the FS-shadow deletion stories that
require fixing test setup across the codebase). 60 turns wasn't enough
for the larger ones.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:20:22 +00:00
dave 191883fe2a config: brutalist refactor guidance + bump mergemaster inactivity_timeout
- Append to all coder/opus system_prompts: for delete/signature-change
  refactors, delete first and let compiler errors guide the call-site
  walk; do not pre-read files predicting breakage. Reduces exploration
  overhead on mechanical refactors.

- Bump mergemaster inactivity_timeout_secs 300 -> 900 (15 min) so
  mergemaster survives the 5-minute API rate-limit backoff. Without
  this, mergemaster gets killed for inactivity while waiting on rate
  limit clear, blocking all merges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:19:09 +00:00
dave 0b7f7dfdf7 config: bump sonnet coder-1/2/3 max_turns 50→80
Stories like the broadcaster-consumer migrations legitimately need ~60
substantive turns (16 ProjectConfig initializer sites + main.rs subscriber
+ reading existing patterns to mirror). 50 was too tight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:56:24 +00:00
dave 56c979c950 config: tell mergemaster to use 5-min sleeps between merge_agent_work polls
Real cause of mergemaster turn-burnout: not merge conflicts, just polling
overhead. The server-side tool_merge_agent_work IS designed to block until
the merge completes, but the MCP client times out after 60s. The agent
then polls get_merge_status, with 30-60s sleeps between polls — each
poll cycle costs 2 turns (sleep + tool call). The merge takes 5-10 min
for a clean run, so the agent burns 10-20 turns just waiting.

Updated workflow tells mergemaster:
- 'operation timed out' is normal, do NOT immediately re-call (would queue
  a duplicate merge)
- Use Bash sleep 300 (one 5-min wait = 1 turn) between polls
- Cap at 3 polls = 15 minutes total, plenty for any clean merge
- Reserve turns for actual fix-up work if gates fail

Combined with the earlier 30→60 turn / $5→$15 budget bump, this should
land any merge with no real conflicts in 3-5 turns total. Plenty of
headroom remaining for genuine gate-fix work.
2026-04-27 10:50:44 +00:00
dave 7b305ba892 config: bump mergemaster max_turns 30→60, budget $5→$15
30 turns is too tight for non-trivial merge gate failures. Combined with
the 3-retry cap, stories with any post-merge fix-up needed (cargo fmt
nits, slightly out-of-date diffs after parallel merges, etc.) get
permanently blocked.

This is a stopgap until story 668 lands (which will keep gates_passed=false
work in the coder stage entirely, so mergemaster only ever sees clean
diffs and the original 30 turns / $5 is fine again).
2026-04-27 10:41:45 +00:00
dave a4480fa067 chore: feed CONTEXT and STACK specs to all agents, update STACK with source map
Agents now read specs/00_CONTEXT.md (what the project does) and
specs/tech/STACK.md (tech stack + source map) in addition to the
README. STACK.md rewritten to reflect current state — removes stale
references to biome, tauri-specta, .story_kit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 18:15:09 +00:00
dave 483489cc44 fix: rewrite coder agent prompts — run tests before commit, remove stale instructions
Key changes:
- Tests before commit, not after: "run run_tests, fix failures, then commit"
- Removed polling references (run_tests blocks now)
- Removed "never run script/test" (primes agents to think about it)
- Removed dead "user review" instruction
- Removed "commit and stop" which signalled skip-testing
- Cleaner workflow: implement → check criteria → test → fix → commit → exit

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 13:19:08 +00:00
dave 28adef9739 chore: switch mergemaster to opus and add cargo fmt guidance
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:35:57 +00:00
dave badfabcf5e chore: switch mergemaster to opus and add cargo fmt guidance
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:27:58 +00:00
dave b7f077197d chore: add doc comment guidance to coder agent system prompts
Agents now know to add //! module comments and /// doc comments
to new public items, keeping documentation consistent with the
codebase-wide doc pass from story 542.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:25:21 +00:00
dave f958f57e56 fix: async run_tests to prevent zombie cargo processes blocking gates
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.

Also fixes agent permission mode: acceptEdits replaces invalid
allowFullAutoEdit that was causing agents to crash-loop on spawn.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:00:05 +00:00
dave c0d1be675b fix: mergemaster prompt says merge_agent_work blocks — no polling needed 2026-04-11 18:13:53 +00:00
dave a9a1852422 fix: agent prompts say trust the story description instead of always investigating
Agents were spending entire $5 budgets grepping the codebase and reading
git history instead of making fixes when the story already specifies
exact file paths and function names. Changed bug workflow from
"investigate root cause first" to "trust the story, act fast" — go
directly to the specified location when the story tells you where.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 15:16:12 +00:00
dave dd53870c59 fix: agent prompts use run_tests MCP tool instead of running script/test via Bash
Agents were running script/test directly through the PTY, streaming
the full output of npm install, cargo clippy, cargo test, and frontend
builds into session logs. This tripled session log sizes (~200KB to
~600KB per session) and contributed to CLI SIGABRT crashes.

The run_tests MCP tool already runs script/test server-side and returns
a truncated JSON summary. Agents now use it exclusively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:46:05 +00:00
dave 7e5b9839e8 huskies: merge 523_refactor_introduce_script_test_script_lint_script_build_and_migrate_agent_prompts_off_tech_specific_commands 2026-04-10 11:22:51 +00:00
Timmy 934bda5904 Trying out sonnet for merges 2026-04-10 01:04:25 +01:00
dave 470e7a5fd5 huskies: merge 482_refactor_split_agent_definitions_from_project_toml_into_agents_toml 2026-04-04 21:24:22 +00:00