92 Commits

Author SHA1 Message Date
dave e20083a283 huskies: merge 624_bug_agent_turn_and_budget_limits_not_enforced_coder_1_ran_5_6x_over_max_turns 2026-04-25 13:11:30 +00:00
dave 4b765bbc39 huskies: merge 601_story_project_local_agent_prompt_layer_for_huskies 2026-04-23 11:56:19 +00:00
dave d235fd41ac huskies: merge 581_story_freeze_command_to_hold_a_story_at_its_current_stage_without_advancing 2026-04-15 18:02:14 +00:00
dave df5ba8ebab huskies: merge 560_story_make_merge_agent_work_return_results_like_run_tests_instead_of_polling 2026-04-14 10:26:44 +00:00
dave 979cf39228 huskies: merge 557_refactor_remove_all_filesystem_fallback_paths_crdt_is_the_only_source_of_truth 2026-04-14 09:14:07 +00:00
dave 10d3517648 fix: remove filesystem fallback from scan_stage_items to unblock 557 merge
The auto-resolver kept both sides of the conflict — feature's
_project_root signature with master's filesystem code referencing
project_root — producing a compile error. Remove the filesystem
fallback on master so there's no conflict. CRDT is the only source
of truth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:14:58 +00:00
dave d618bc3b32 huskies: merge 556_bug_stale_filesystem_shadows_in_1_backlog_cause_auto_assign_to_promote_archived_stories 2026-04-13 14:48:44 +00:00
dave 845b85e7a7 fix: add --all to cargo fmt in script/test and autoformat codebase
cargo fmt without --all fails with "Failed to find targets" in
workspace repos. This was blocking every story's gates. Also ran
cargo fmt --all to fix all existing formatting issues.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 14:07:08 +00:00
dave 5806156af3 huskies: merge 553_story_accept_spike_state_machine_transition_skips_merge_and_goes_directly_to_done 2026-04-13 12:54:09 +00:00
dave cec62dad1c huskies: merge 542_refactor_add_doc_comments_to_all_undocumented_source_files_and_generate_source_map_in_readme 2026-04-12 13:16:11 +00:00
dave b4dbfcbde6 huskies: merge 541_story_backlog_command_for_chat_and_web_ui_shows_only_backlog_items 2026-04-12 13:05:12 +00:00
dave 5f01631e6a huskies: merge 543_story_resume_failed_coder_agents_with_resume_instead_of_starting_fresh_sessions 2026-04-12 12:58:42 +00:00
dave c80931c15c fix: add ETXTBSY retry to run_coverage_gate
Use fsync in coverage gate tests to ensure the kernel releases the
write handle before executing the script. Prevents flaky ETXTBSY
errors on fast test runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 12:40:08 +00:00
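
The two ideas named in c80931c15c (sync the script to disk before running it, and retry when the exec fails with ETXTBSY) can be sketched roughly as follows. This is not the project's code: `write_synced`, `retry_on_etxtbsy`, the attempt limit, and the delay are hypothetical names and parameters, and the errno value 26 is an assumption that holds on Linux.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;
use std::thread;
use std::time::Duration;

/// ETXTBSY ("text file busy") errno. 26 on Linux; an assumption of this sketch.
const ETXTBSY: i32 = 26;

/// Write `contents` to `path` and flush it to disk. The file handle is
/// closed when `file` goes out of scope at the end of the function.
pub fn write_synced(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(contents)?;
    file.sync_all()?;
    Ok(())
}

/// Run `op`, retrying only when it fails with ETXTBSY, up to `max_attempts`
/// calls in total. Any other error, or the last ETXTBSY, is returned as-is.
pub fn retry_on_etxtbsy<T>(
    mut op: impl FnMut() -> io::Result<T>,
    max_attempts: u32,
    delay: Duration,
) -> io::Result<T> {
    let mut attempt = 1;
    loop {
        match op() {
            Err(e) if e.raw_os_error() == Some(ETXTBSY) && attempt < max_attempts => {
                attempt += 1;
                thread::sleep(delay);
            }
            other => return other,
        }
    }
}
```

In a gate runner, `op` would typically be a closure that spawns the script with `std::process::Command` and returns its `io::Result`.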
dave 06defd9596 fix: collapse nested if-let blocks to satisfy clippy collapsible_if lint
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:43:36 +00:00
dave b43e7cf752 fix: kill stale cargo processes before running acceptance gates
The completion handler now pgrep+kills any cargo processes targeting
the worktree's Cargo.toml before running gates. This prevents the
run_tests MCP child from holding the build lock and blocking gates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 00:25:56 +00:00
dave f958f57e56 fix: async run_tests to prevent zombie cargo processes blocking gates
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.

Also fixes agent permission mode: acceptEdits replaces invalid
allowFullAutoEdit that was causing agents to crash-loop on spawn.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:00:05 +00:00
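
The start-then-poll shape described in f958f57e56 might look something like the sketch below. Only the tool names `run_tests` and `get_test_result` come from the commit message; the `TestJobs` registry, the `TestStatus` enum, and the numeric job id are assumptions made for illustration, and a closure stands in for the real cargo invocation.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical job state; the real tool's types are not shown in the log.
#[derive(Clone, Debug, PartialEq)]
pub enum TestStatus {
    Running,
    Finished { passed: bool },
}

/// Hypothetical in-memory registry of background test runs.
#[derive(Clone, Default)]
pub struct TestJobs {
    next_id: Arc<Mutex<u64>>,
    jobs: Arc<Mutex<HashMap<u64, TestStatus>>>,
}

impl TestJobs {
    /// Start `work` on a background thread and return a job id right away,
    /// so the caller is never blocked for the duration of the test run.
    pub fn run_tests<F>(&self, work: F) -> u64
    where
        F: FnOnce() -> bool + Send + 'static,
    {
        let id = {
            let mut next = self.next_id.lock().unwrap();
            *next += 1;
            *next
        };
        self.jobs.lock().unwrap().insert(id, TestStatus::Running);
        let jobs = Arc::clone(&self.jobs);
        thread::spawn(move || {
            let passed = work();
            jobs.lock().unwrap().insert(id, TestStatus::Finished { passed });
        });
        id
    }

    /// Non-blocking status check; `None` means the job id is unknown.
    pub fn get_test_result(&self, id: u64) -> Option<TestStatus> {
        self.jobs.lock().unwrap().get(&id).cloned()
    }
}
```

Because `run_tests` returns before the work finishes, a caller that times out simply stops polling; the background run still completes and records its result.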
dave e32300d1f8 fix: switch agent permission mode from bypassPermissions to allowFullAutoEdit
bypassPermissions ignored the worktree's .claude/settings.json entirely,
letting agents run any Bash command including cargo test (which they'd
spawn 4+ times concurrently, deadlocking on the build directory lock).

allowFullAutoEdit respects the settings.json allowlist, so agents can
only use the Bash commands we explicitly permit (cargo check, cargo
build, git) and must use MCP tools for everything else (run_tests,
run_lint, run_build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 20:23:22 +00:00
dave d06241c20c fix: merge_agent_work blocks until complete instead of requiring polling
The mergemaster agent was burning all 30 turns polling get_merge_status
every 2 seconds while the merge pipeline takes ~2 minutes. It would
exhaust turns, exit, restart, and repeat — never seeing the result.

merge_agent_work now blocks with a 10-second internal poll loop and
returns the final result directly. The agent calls it once and gets
the answer. No more polling turns wasted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:43:50 +00:00
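
As a rough illustration of the change in d06241c20c, the sketch below moves the poll loop inside the call so that the caller makes one blocking request instead of many polling turns. `block_until_done` is a hypothetical helper, not the project's function, and the `timeout` parameter is an assumption of this sketch (the commit message mentions only the 10-second poll interval).

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Call `poll` every `interval` until it yields a result, then return it.
/// Gives up with an error once `timeout` has elapsed.
pub fn block_until_done<T>(
    mut poll: impl FnMut() -> Option<T>,
    interval: Duration,
    timeout: Duration,
) -> Result<T, String> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(result) = poll() {
            return Ok(result);
        }
        if Instant::now() >= deadline {
            return Err("timed out waiting for the merge to finish".to_string());
        }
        thread::sleep(interval);
    }
}
```

With an interval of `Duration::from_secs(10)` and a pipeline that takes about two minutes, the loop would check the status roughly a dozen times internally while the agent spends a single turn.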
dave 599fbdc71d huskies: merge 539_bug_crdt_event_bridge_still_writes_filesystem_shadow_files_after_530_eliminated_filesystem_state 2026-04-11 17:04:36 +00:00
dave dcf6cf8f82 fix: collapse consecutive str::replace calls to satisfy clippy 2026-04-11 13:21:47 +00:00
dave 5696d77922 debug: add PTY spawn diagnostics for Session: None investigation
When an agent CLI exits without creating a session, we now log:
- Number of prior sessions and total session log bytes
- Child process exit status (exit code or signal)
- Explicit SESSION NONE warning with context

This will help diagnose whether the fatal runtime error
(output.write assertion) correlates with accumulated sessions,
budget exhaustion, or something else.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:21:06 +00:00
dave bc2b1e244c huskies: merge 498_bug_stale_merge_job_lock_prevents_new_merges_after_agent_dies 2026-04-10 18:55:05 +00:00
dave 91be0ac47f huskies: merge 534_refactor_unify_timer_tick_watchdog_and_watcher_sweep_into_a_single_1_second_tick_loop 2026-04-10 17:38:42 +00:00
dave bfede09fe6 huskies: merge 529_bug_stale_mergemaster_advance_moves_done_stories_back_to_merge_zombie_merge_loop 2026-04-10 15:20:34 +00:00
dave 11d19d8902 huskies: merge 530_story_eliminate_filesystem_markdown_shadows_entirely_crdt_db_is_the_only_story_store 2026-04-10 14:59:58 +00:00
dave 31388da609 huskies: merge 517_story_remove_filesystem_shadow_fallback_paths_from_lifecycle_rs_finish_the_migration_to_crdt_only 2026-04-10 13:00:25 +00:00
dave d1b845fd2e fix: move_item must not overwrite advanced CRDT stage when missing_ok=true (bug 524)
When a story is found in the CRDT but not in the expected source stages,
and missing_ok is true, return Ok(None) instead of proceeding with the move.
This prevents promote_ready_backlog_stories from demoting a story that has
already advanced to merge/done via a stale filesystem shadow in 1_backlog.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 00:21:39 +00:00
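
The rule described in d1b845fd2e can be illustrated with a simplified, self-contained model. The `Stage` enum and this `move_item` signature are assumptions made for the sketch; the real function operates on CRDT state and is not shown in the log.

```rust
/// Hypothetical pipeline stages for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Stage {
    Backlog,
    Current,
    Qa,
    Merge,
    Done,
}

/// Move an item to `target` only if its current stage is one of `sources`.
/// Returns `Ok(Some(new_stage))` when the move happens and `Ok(None)` when
/// the move is skipped because `missing_ok` is set.
pub fn move_item(
    current: Option<Stage>,
    sources: &[Stage],
    target: Stage,
    missing_ok: bool,
) -> Result<Option<Stage>, String> {
    match current {
        // Found in an expected source stage: perform the move.
        Some(stage) if sources.contains(&stage) => Ok(Some(target)),
        // Found elsewhere (for example, already in Merge or Done): leave it
        // untouched rather than overwriting the more advanced stage.
        Some(_) if missing_ok => Ok(None),
        Some(stage) => Err(format!("item is in {:?}, not in an expected source stage", stage)),
        // Not present at all.
        None if missing_ok => Ok(None),
        None => Err("item not found".to_string()),
    }
}
```

Under this model, a promotion attempt from `Backlog` on a story that is already in `Merge` returns `Ok(None)` and changes nothing.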
dave d3ee850f37 huskies: merge 500_story_remove_duplicate_pty_debug_log_lines 2026-04-09 22:16:03 +00:00
dave cbe016d7a2 huskies: merge 519_story_mergemaster_should_detect_no_commits_ahead_of_master_and_fail_loudly_instead_of_exiting_silently 2026-04-09 22:11:09 +00:00
dave 84717b04bd huskies: merge 520_story_typed_pipeline_state_machine_in_rust_foundation_replaces_stringly_typed_crdt_views_with_strict_enums_subsumes_436 2026-04-09 21:27:48 +00:00
dave 41515e3b8f huskies: merge 503_bug_depends_on_pointing_at_an_archived_story_is_silently_treated_as_deps_met_surprising_users 2026-04-09 18:31:29 +00:00
Timmy 8b2e068d3e fix(502): don't demote merge-stage stories on mergemaster attach
start_agent unconditionally called move_story_to_current at the top of
its body, before the agent-stage check. When called for mergemaster (or
qa) on a story in 4_merge/ AND a stale 1_backlog/ shadow of the story
existed (post-491/492 split-brain artifact), the move would find the
shadow and yank it to 2_current/, find_active_story_stage would then
report 2_current/, the stage check would expect a Coder agent, and
mergemaster would be rejected — leaving the story in 2_current/ to be
re-promoted by the next auto-assign tick. Infinite loop.

Gate the move so it only fires for Coder-stage agents. QA and
Mergemaster now attach to the story at its existing stage.

Adds a regression test that reproduces the split-brain scenario by
seeding both 4_merge/ and 1_backlog/ copies of the same story and
asserting (1) the stage check does not reject mergemaster, and (2) the
4_merge/ copy is preserved (i.e. not demoted to 2_current/).

Observed live on 2026-04-09 while story 478 was looping. Filed as
bug 502.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:18:01 +01:00
dave 8fd49d563e huskies: merge 492_story_remove_filesystem_pipeline_state_and_store_story_content_in_database 2026-04-08 03:07:33 +00:00
dave eba933e21e huskies: merge 497_bug_dependency_promotion_loop_missing_stories_with_met_deps_never_move_from_backlog_to_current 2026-04-08 01:32:26 +00:00
dave 5c2769dd7d huskies: merge 491_story_watcher_fires_on_crdt_state_transitions_instead_of_filesystem_events 2026-04-08 01:18:30 +00:00
dave dea410149a huskies: merge 496_bug_hard_rate_limit_without_reset_at_never_auto_schedules_retry 2026-04-08 00:04:25 +00:00
dave 753f7f1c92 fix: comment out premature db::crdt references that broke build
The 490 merge introduced references to a db::crdt module that doesn't
exist yet (it's part of story 491). Commented out with TODO(491)
markers so master compiles. The crdt_state.rs module from 490 is
intact — these are just the call sites that will be wired up when
491 lands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:49:11 +00:00
dave 15a52d6d38 ignore kleppmann_trace test — 10+ min, 12GB RAM
Marked #[ignore] so cargo test skips it by default. Run it manually with
cargo test -- --ignored when needed for benchmarking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:15:38 +00:00
dave 7eecfeb56a bump gate timeout from 600s to 1200s
The merge worktree cold-compiles the BFT CRDT crate and all its
dependencies, which takes longer than 600s. 1200s leaves enough headroom.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:47:44 +00:00
dave f1ef31d1ee huskies: merge 489_story_sqlite_shadow_write_for_pipeline_state_via_sqlx 2026-04-07 13:13:17 +00:00
dave 5413a26406 huskies: merge 484_story_story_dependencies_in_pipeline_auto_assign 2026-04-04 21:46:58 +00:00
dave 91d31d908f huskies: merge 476_refactor_split_agents_pool_lifecycle_rs_into_submodules 2026-04-04 20:54:24 +00:00
dave eb8654dba0 huskies: merge 475_refactor_deduplicate_lifecycle_rs_move_functions_into_a_shared_parameterised_helper 2026-04-04 15:23:49 +00:00
Timmy 2d8ccb3eb6 huskies: rename project from storkit to huskies
Rename all references from storkit to huskies across the codebase:
- .storkit/ directory → .huskies/
- Binary name, Cargo package name, Docker image references
- Server code, frontend code, config files, scripts
- Fix script/test to build frontend before cargo clippy/test
  so merge worktrees have frontend/dist available for RustEmbed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 16:12:52 +01:00
dave 641384e794 storkit: merge 462_bug_stage_transition_notifications_can_arrive_out_of_order_and_show_wrong_story_name 2026-04-03 12:04:58 +00:00
dave f16545ec36 fix: join PTY reader thread before returning to prevent stale fd leak (#453)
The reader thread spawned in run_agent_pty_blocking was never joined,
leaving a cloned PTY master fd open after the agent exited. When the
pipeline restarted the agent on the same worktree, the stale fd from
the previous session interfered with the new PTY allocation, causing
Claude Code's bundled ripgrep to crash with:
  fatal runtime error: assertion failed: output.write(&bytes).is_ok()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 14:41:00 +00:00
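
The fix in f16545ec36 amounts to joining the reader thread before returning. A minimal sketch of that pattern follows, with a generic `Read` standing in for the cloned PTY master fd; `run_with_joined_reader` and the `wait_for_child` closure are hypothetical and do not reflect the actual signature of run_agent_pty_blocking.

```rust
use std::io::Read;
use std::thread;

/// Drain `reader` on a background thread while the child runs, then join
/// that thread before returning so the reader (and the fd it owns) is
/// guaranteed to be closed by the time the caller continues.
pub fn run_with_joined_reader<R>(
    mut reader: R,
    wait_for_child: impl FnOnce() -> i32,
) -> (i32, Vec<u8>)
where
    R: Read + Send + 'static,
{
    let reader_thread = thread::spawn(move || {
        let mut output = Vec::new();
        // Read errors are ignored here; a real PTY may report an error
        // instead of a clean end-of-file once the child has exited.
        let _ = reader.read_to_end(&mut output);
        drop(reader); // release the handle before the thread finishes
        output
    });

    let exit_code = wait_for_child();

    // Without this join the thread, and the handle it owns, could outlive
    // this function and still be open when the next session starts.
    let output = reader_thread.join().unwrap_or_default();
    (exit_code, output)
}
```

A quick check with an in-memory reader: `run_with_joined_reader(std::io::Cursor::new(b"hello".to_vec()), || 0)` yields exit code 0 and the bytes of "hello".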
dave 3048d26e66 storkit: merge 445_bug_rate_limited_mergemaster_exits_advance_stories_to_done_without_merging 2026-03-28 20:08:15 +00:00
dave 5dcc35a1b3 fix: gate runner delegates to script/test instead of hardcoding cargo clippy
The acceptance gate was hardcoded to run cargo clippy, which fails on
non-Rust projects (Go, Node, etc.). Now the gate only runs script/test
which is project-specific. Clippy is added to storkit's own script/test
so Rust linting is preserved for this project.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:15:29 +00:00
dave 361f9dff0d fix(426): also narrow pre-cherry-pick code change check to .storkit/work/
There were two places checking for code changes: the post-cherry-pick
verification (already fixed) and a pre-cherry-pick check in the
merge-queue worktree. The pre-cherry-pick check was still filtering
all of .storkit/ which rejected stories that only change project.toml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:27:19 +00:00
dave 93576e3f83 fix(426): narrow merge verification exclude to .storkit/work/ only
The post-cherry-pick diff check was excluding all of .storkit/, which
rejected stories whose deliverable is .storkit/project.toml changes
(e.g. 431 updating QA agent prompts). Narrow the exclusion to
.storkit/work/ which is where pipeline file moves live.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:21:57 +00:00