Commit Graph

3378 Commits

Author SHA1 Message Date
dave 615e1c7f73 huskies: merge 738_refactor_delete_fs_shadow_code_from_lifecycle_rs_and_the_work_directory_watcher 2026-04-27 19:56:53 +00:00
dave 63a30a9319 huskies: merge 736_story_drain_and_prepend_buffered_status_events_on_the_user_s_next_agent_message 2026-04-27 19:37:39 +00:00
dave aa7b26a24a huskies: merge 728_story_cryptographic_peer_handshake_with_trusted_keys_gating 2026-04-27 19:22:55 +00:00
dave ded8c6fd66 huskies: merge 685_refactor_decompose_server_src_config_rs_1223_lines 2026-04-27 19:16:25 +00:00
dave 26f9f3f7fc huskies: merge 729_story_store_story_name_as_a_crdt_field_separate_from_the_story_id 2026-04-27 19:09:56 +00:00
dave 4aadf4aa47 huskies: merge 684_refactor_decompose_server_src_http_agents_rs_1249_lines 2026-04-27 18:49:53 +00:00
dave 4b64bc614f huskies: merge 726_story_notify_chat_transports_when_oauth_account_swaps_or_all_accounts_are_exhausted 2026-04-27 18:44:47 +00:00
dave 80661fa622 huskies: merge 727_story_ed25519_node_identity_keypair_generation_persistence_and_identity_endpoint 2026-04-27 18:37:58 +00:00
dave b008235d0d huskies: merge 683_refactor_decompose_server_src_agents_pool_start_mod_rs_1329_lines 2026-04-27 18:26:31 +00:00
dave 646dc490b8 huskies: merge 720_refactor_add_mesh_status_mcp_tool_read_only_peer_mesh_diagnostics 2026-04-27 18:18:51 +00:00
dave c6bc6f07f7 huskies: merge 725_story_auto_swap_oauth_account_on_rate_limit 2026-04-27 18:12:57 +00:00
dave 272a592a4d huskies: merge 735_story_attach_statuseventbuffer_to_each_agent_session_scoped_per_project_reset_on_restart 2026-04-27 18:06:11 +00:00
dave d654f55981 huskies: merge 682_refactor_decompose_server_src_agents_merge_squash_rs_1346_lines 2026-04-27 17:58:04 +00:00
dave 101f616346 huskies: merge 719_refactor_stale_merge_job_lock_recovery_on_new_merge_attempts 2026-04-27 17:46:49 +00:00
dave 1ecb4dad55 huskies: merge 724_story_per_account_oauth_credential_storage_with_login_pool 2026-04-27 17:40:53 +00:00
dave ed8646f0d9 huskies: merge 681_refactor_decompose_server_src_agents_pool_pipeline_advance_mod_rs_1509_lines 2026-04-27 17:35:17 +00:00
dave 875096b3ec huskies: merge 718_refactor_stale_agent_claims_time_out_claim_ttl_with_displacement 2026-04-27 17:26:51 +00:00
dave 77081926d1 huskies: merge 715_refactor_decompose_frontend_src_components_workitemdetailpanel_tsx_827_lines 2026-04-27 17:08:26 +00:00
dave fce7e16811 huskies: merge 716_story_statuseventbuffer_bounded_per_instance_buffer_over_services_status_broadcaster 2026-04-27 17:03:12 +00:00
dave 2b28ccbf2c Merge spike branch 'feature/story-679_spike_migrate_inter_component_http_to_signed_crdt_websocket_bus' into master 2026-04-27 17:01:48 +00:00
dave 4a0f57478c huskies: merge 671_refactor_migrate_pipeline_state_consumers_from_string_comparisons_to_typed_pipelinestage_enum 2026-04-27 16:39:39 +00:00
dave 39a9766d7d huskies: merge 677_refactor_reject_promotion_to_current_coder_of_work_items_with_junk_only_acceptance_criteria 2026-04-27 16:30:35 +00:00
dave 5884dac825 chore: gitignore .huskies/session_store.json (runtime artifact)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:59:33 +00:00
dave 0b7f7dfdf7 config: bump sonnet coder-1/2/3 max_turns 50→80
Stories like the broadcaster-consumer migrations legitimately need ~60
substantive turns (16 ProjectConfig initializer sites + main.rs subscriber
+ reading existing patterns to mirror). 50 was too tight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:56:24 +00:00
dave 756c790b9f spike 679: document HTTP-to-CRDT-bus migration plan
Full inventory of all gateway and project server endpoints with caller,
purpose, latency/freshness/durability requirements. Classifies each as
write/read/external-webhook/frontend-asset. Maps write endpoints to
target CRDT collections, proposes RPC frame shapes for read endpoints,
drafts the unsigned read-RPC protocol (envelope, correlation IDs, TTL,
error codes, peer-offline handling), lists in-memory state needing CRDT
migration with proposed types, and defines a wave-ordered migration plan
with explicit dependencies (story 665 Ed25519 auth as the blocker for
write migrations).
2026-04-27 14:49:38 +00:00
dave 6a582d73b6 huskies: merge 675_bug_mergemaster_silently_exits_when_feature_branch_has_zero_commits_ahead_of_master 2026-04-27 14:43:54 +00:00
dave ea872fa01c huskies: merge 676_bug_apply_and_persist_silently_drops_ops_when_persist_channel_send_fails 2026-04-27 14:38:11 +00:00
dave cbb0a50729 huskies: merge 649_story_migrate_whatsapp_transport_to_status_broadcaster 2026-04-27 14:19:19 +00:00
dave 6c8043d866 huskies: merge 648_story_migrate_discord_transport_to_status_broadcaster 2026-04-27 14:01:32 +00:00
dave 9040d18f50 huskies: merge 664_story_crdt_lamport_clock_inner_seq_must_resume_from_max_own_author_seq_1_instead_of_resetting_to_1_on_restart_phase_c 2026-04-27 12:30:44 +00:00
dave 25603bb8cb huskies: merge 669_story_migrate_slack_transport_to_status_broadcaster 2026-04-27 11:57:06 +00:00
dave 5da29c3d91 huskies: merge 668_bug_pipeline_advances_coder_work_to_merge_when_gates_passed_false 2026-04-27 11:39:11 +00:00
dave 65d2fb210c huskies: merge 655_bug_matrix_bot_spawns_its_own_timerstore_instead_of_using_shared_appcontext_timer_store 2026-04-27 11:32:51 +00:00
dave ac85cfce5d huskies: merge 652_story_pass_resume_session_id_on_agent_respawn_so_new_sessions_inherit_prior_reasoning 2026-04-27 11:27:50 +00:00
dave 144f07f412 huskies: merge 644_story_chat_transport_consumers_slack_discord_whatsapp_matrix_for_the_unified_status_broadcaster 2026-04-27 11:22:52 +00:00
dave 75533225e4 fix: commit minor fmt residue blocking mergemaster cherry-picks
Master had 8 uncommitted single-line whitespace changes (blank-line trimming
in test mod headers, etc.) left over from a previous mergemaster cargo-fmt
run that didn't get committed. Each subsequent merge attempt hit:

  cherry-pick failed: 'Your local changes to the following files would be
  overwritten by merge. Please commit your changes or stash them.'

So merges had been silently un-mergeable for the last several rounds —
mergemaster correctly reported the issue but had no way to fix master's
own state from inside the merge_workspace.

Files affected (all whitespace-only):
- chat/transport/matrix/bot/messages/{handle_message,on_room_message}.rs
- chat/transport/slack/commands/{llm,mod}.rs
- http/mcp/agent_tools/worktree.rs
- http/workflow/story_ops/{create,criterion,update}.rs

cargo clippy --all-targets -- -D warnings: clean
cargo fmt --all --check: clean
2636 tests pass.
2026-04-27 11:17:31 +00:00
dave 56c979c950 config: tell mergemaster to use 5-min sleeps between merge_agent_work polls
Real cause of mergemaster turn-burnout: not merge conflicts, just polling
overhead. The server-side tool_merge_agent_work IS designed to block until
the merge completes, but the MCP client times out after 60s. The agent
then polls get_merge_status, with 30-60s sleeps between polls — each
poll cycle costs 2 turns (sleep + tool call). The merge takes 5-10 min
for a clean run, so the agent burns 10-20 turns just waiting.

Updated workflow tells mergemaster:
- 'operation timed out' is normal, do NOT immediately re-call (would queue
  a duplicate merge)
- Use Bash sleep 300 (one 5-min wait = 1 turn) between polls
- Cap at 3 polls = 15 minutes total, plenty for any clean merge
- Reserve turns for actual fix-up work if gates fail

Combined with the earlier 30→60 turn / $5→$15 budget bump, this should
land any merge with no real conflicts in 3-5 turns total. Plenty of
headroom remaining for genuine gate-fix work.
2026-04-27 10:50:44 +00:00
dave 7b305ba892 config: bump mergemaster max_turns 30→60, budget $5→$15
30 turns is too tight for non-trivial merge gate failures. Combined with
the 3-retry cap, stories with any post-merge fix-up needed (cargo fmt
nits, slightly out-of-date diffs after parallel merges, etc.) get
permanently blocked.

This is a stopgap until story 668 lands (which will keep gates_passed=false
work in the coder stage entirely, so mergemaster only ever sees clean
diffs and the original 30 turns / $5 is fine again).
2026-04-27 10:41:45 +00:00
dave 7408cc5b4b fix(crdt_snapshot): per-thread SNAPSHOT_STATE in cfg(test) instead of shared static (bug 669)
Replaces the test-time GLOBAL_STATE_LOCK approach (which was just disguised
single-threading) with proper test isolation: each test thread gets its own
SnapshotState via a thread_local!.

Pattern matches crdt_state::CRDT_STATE_TL — production keeps the global
OnceLock; tests get a per-thread OnceLock that's accessed through a
snapshot_state() helper. The unsafe `&*ptr` cast to 'static is safe because
the thread_local lives as long as the spawning test thread.

The race: latest_snapshot_available_after_compaction captured at_seq from a
freshly-generated snapshot, then asserted it equalled SNAPSHOT_STATE's
latest.at_seq. With shared SNAPSHOT_STATE, another test thread's
apply_compaction could overwrite latest_snapshot between capture and read.
Per-thread state eliminates the race at its source.

ALL_OPS / VECTOR_CLOCK stay shared — the tests don't assert on absolute
counts, only on (this-thread's at_seq) == (this-thread's latest.at_seq).

5 consecutive default-parallel `cargo test --bin huskies` runs all green
at 2636/2636.
2026-04-27 02:49:53 +00:00
dave fc71c22305 Revert "fix(crdt_snapshot): serialise tests that share global SNAPSHOT_STATE / ALL_OPS / VECTOR_CLOCK (bug 669)"
This reverts commit 8e608feec1.
2026-04-27 02:45:01 +00:00
dave 8e608feec1 fix(crdt_snapshot): serialise tests that share global SNAPSHOT_STATE / ALL_OPS / VECTOR_CLOCK (bug 669)
The crdt_snapshot tests share three global statics:
- SNAPSHOT_STATE (latest_snapshot, pending_acks, pending_at_seq) — coordination state
- crdt_state::ALL_OPS / VECTOR_CLOCK — op journal + vector clock

Only the per-thread CRDT is thread-local (init_for_test); these other globals
are shared across test threads. Under default cargo test parallelism, two tests
running concurrently interleave their op writes and snapshot generation, so
assertions like assert_eq!(at_seq, 4) fail with at_seq=5 (the other thread's
ops snuck in).

Add a module-level GLOBAL_STATE_LOCK that all 17 affected tests grab at the
top. unwrap_or_else(|e| e.into_inner()) handles the case where a prior test
panicked while holding the lock (poisoned).

Fixes bug 669 — these two tests were the silent killer behind every agent's
script/test failure (see also bug 668, which advanced agents to merge despite
gates_passed=false; that compounded this by sending failing-tests worktrees
to mergemaster).

All 2636 tests now pass under default parallel execution (no --test-threads=1
needed).

Closes #669.
2026-04-27 02:43:49 +00:00
dave 404fd396f5 refactor: split chat/transport/whatsapp/commands.rs (837) into mod + llm
The 837-line commands.rs is split:

- llm.rs: handle_llm_message (LLM turn for non-command messages, ~195 lines)
- mod.rs: handle_incoming_message + tests (~660 lines)

Tests stay co-located with handle_incoming_message in mod.rs. All 2636 tests pass; clippy clean.
2026-04-27 02:37:22 +00:00
dave 1f02de8cd0 refactor: split chat/transport/slack/commands.rs (875) into mod + llm
The 875-line commands.rs is split:

- llm.rs: handle_llm_message (LLM turn for non-command messages, ~190 lines)
- mod.rs: SlackSlashCommandPayload + slash_command_to_bot_keyword + handle_incoming_message + tests (~700 lines)

Tests stay co-located with handle_incoming_message in mod.rs. All 2636 tests pass; clippy clean.
2026-04-27 02:32:11 +00:00
dave d07728f22b refactor: split chat/transport/matrix/bot/messages.rs (912) into mod + on_room_message + handle_message
The 912-line messages.rs is split:

- on_room_message.rs: incoming Matrix event dispatch (~600 lines)
- handle_message.rs: LLM turn + reply streaming (~265 lines)
- mod.rs: format_user_prompt + tests (~70 lines)

Tests stay co-located with format_user_prompt in mod.rs.

All 2636 tests pass; clippy clean.
2026-04-27 02:21:54 +00:00
dave adf936be07 refactor: split http/workflow/story_ops.rs (1256) into create + criterion + update
The 1256-line story_ops.rs is split:

- create.rs: create_story_file + tests (~232 lines)
- criterion.rs: check/add/remove/edit_criterion_in_file + tests (~525 lines)
- update.rs: update_story_in_file + yaml helpers + tests (~640 lines)
- mod.rs: re-exports (~12 lines)

Workflow helpers (read_story_content, write_story_content, slugify_name, etc.)
bumped from pub(super) to pub(crate) since they're now consumed across nested
sub-modules and from http/mcp/story_tools/.

Tests stay co-located. All 2636 tests pass; clippy clean.
2026-04-27 02:13:31 +00:00
dave 34a399b838 refactor: split http/mcp/shell_tools.rs (1144) into mod + exec + script
The 1144-line shell_tools.rs is split:

- exec.rs: validate_working_dir + tool_run_command + handle_run_command_sse
  + their tests (~550 lines)
- script.rs: tool_run_tests + tool_get_test_result + tool_run_build +
  tool_run_lint + helpers + their tests (~610 lines)
- mod.rs: re-exports (~12 lines)

Tests stay co-located. All 2636 tests pass; clippy clean.
2026-04-27 02:04:04 +00:00
dave 928d613190 refactor: split http/mcp/agent_tools.rs (1094) into mod + worktree
The 1094-line agent_tools.rs is split:

- worktree.rs: tool_create/list/remove_worktree, tool_get_editor_command,
  get_worktree_commits + their tests (~190 lines)
- mod.rs: agent lifecycle tools (start/stop/list/output/config/wait/
  remaining_turns_and_budget/read_coverage helper) + their tests

Tests stay co-located. All 2636 tests pass; clippy clean.
2026-04-27 01:57:46 +00:00
dave a8ead9cd10 refactor: split http/mcp/diagnostics.rs (861) into mod + permission + usage
The 861-line diagnostics.rs is split:

- permission.rs: tool_prompt_permission + helpers + their tests (584 lines)
- usage.rs: tool_get_token_usage + tests (122 lines)
- mod.rs: server_logs, rebuild, version, loc_file, dump_crdt, move_story + tests (185 lines)

Tests stay co-located. The bigger sub-modules (permission at 584 with tests
mostly under 800; usage at 122) are well within the 800-line guide.

Also added #[allow(unused_imports)] to two now-pedantic re-exports in
service/diagnostics/mod.rs that the split made flag.

All 2636 tests pass; clippy clean.
2026-04-27 01:51:36 +00:00
dave 9fbbfcd585 huskies: merge 667_story_agent_prompt_target_maximum_file_size_of_800_lines_as_a_soft_guide_decompose_larger_files_by_concern 2026-04-27 01:37:52 +00:00
dave a1afe069fa chore: remove test_fail.txt accidentally committed 2026-04-27 01:32:49 +00:00