Commit Graph

266 Commits

Author SHA1 Message Date
dave bac07d28a7 fix: increase run_tests MCP timeout to 20 minutes to match acceptance gates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:43:31 +00:00
dave fc89be2f55 fix: server-side 20s blocking in get_test_result to prevent agent poll spam
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:29:38 +00:00
dave f958f57e56 fix: async run_tests to prevent zombie cargo processes blocking gates
run_tests MCP tool now spawns tests in the background and returns
immediately. Agents poll get_test_result to check completion. This
prevents zombie cargo processes from holding the build lock when the
CLI times out the MCP call before tests finish.

Also fixes agent permission mode: acceptEdits replaces the invalid
allowFullAutoEdit, which was causing agents to crash-loop on spawn.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:00:05 +00:00
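A minimal sketch of the pattern behind f958f57e56 and fc89be2f55, assuming a std-only thread-and-condvar model rather than the server's actual runtime: run_tests starts the work in the background and returns at once, and get_test_result blocks server-side for a bounded time before reporting the latest status. TestJobs, TestStatus, the job id, and the closure standing in for the cargo invocation are hypothetical names, not taken from the huskies code.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Status of one background test run (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
pub enum TestStatus {
    Running,
    Finished { passed: bool },
}

/// Shared registry of test runs keyed by a job id (hypothetical).
#[derive(Clone, Default)]
pub struct TestJobs {
    inner: Arc<(Mutex<HashMap<u64, TestStatus>>, Condvar)>,
}

impl TestJobs {
    /// Start `work` on a background thread and return immediately.
    pub fn run_tests<F>(&self, job_id: u64, work: F)
    where
        F: FnOnce() -> bool + Send + 'static,
    {
        self.inner.0.lock().unwrap().insert(job_id, TestStatus::Running);
        let shared = Arc::clone(&self.inner);
        thread::spawn(move || {
            let passed = work();
            let (lock, cvar) = &*shared;
            lock.lock().unwrap().insert(job_id, TestStatus::Finished { passed });
            // Wake any caller currently blocked in get_test_result.
            cvar.notify_all();
        });
    }

    /// Block for at most `max_wait`, then return the latest known status.
    /// Returns None for an unknown job id.
    pub fn get_test_result(&self, job_id: u64, max_wait: Duration) -> Option<TestStatus> {
        let (lock, cvar) = &*self.inner;
        let deadline = Instant::now() + max_wait;
        let mut jobs = lock.lock().unwrap();
        loop {
            match jobs.get(&job_id) {
                None => return None,
                Some(TestStatus::Running) => {}
                Some(done) => return Some(done.clone()),
            }
            let now = Instant::now();
            if now >= deadline {
                return Some(TestStatus::Running);
            }
            // The loop re-checks the status, so spurious wakeups are harmless.
            let (guard, _timed_out) = cvar.wait_timeout(jobs, deadline - now).unwrap();
            jobs = guard;
        }
    }
}
```

Because the wait happens inside get_test_result, a caller that polls in a tight loop is throttled to roughly one call per max_wait while the tests are still running.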
dave e32300d1f8 fix: switch agent permission mode from bypassPermissions to allowFullAutoEdit
bypassPermissions ignored the worktree's .claude/settings.json entirely,
letting agents run any Bash command including cargo test (which they'd
spawn 4+ times concurrently, deadlocking on the build directory lock).

allowFullAutoEdit respects the settings.json allowlist, so agents can
only use the Bash commands we explicitly permit (cargo check, cargo
build, git) and must use MCP tools for everything else (run_tests,
run_lint, run_build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 20:23:22 +00:00
dave d06241c20c fix: merge_agent_work blocks until complete instead of requiring polling
The mergemaster agent was burning all 30 turns polling get_merge_status
every 2 seconds while the merge pipeline takes ~2 minutes. It would
exhaust turns, exit, restart, and repeat — never seeing the result.

merge_agent_work now blocks with a 10-second internal poll loop and
returns the final result directly. The agent calls it once and gets
the answer. No more polling turns wasted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:43:50 +00:00
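The blocking behaviour described in d06241c20c can be sketched as a single call that polls internally and returns only the final result. This is an illustration under assumptions: MergeStatus, block_until_merged, and the timeout parameter are hypothetical, and the status source is passed in as a closure instead of the real merge pipeline.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a merge as seen by the caller (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
pub enum MergeStatus {
    InProgress,
    Succeeded,
    Failed(String),
}

/// Poll `get_merge_status` every `poll_interval` until it reports a final
/// state, then return that state. Gives up with an error after `timeout`.
pub fn block_until_merged<F>(
    mut get_merge_status: F,
    poll_interval: Duration,
    timeout: Duration,
) -> Result<MergeStatus, String>
where
    F: FnMut() -> MergeStatus,
{
    let deadline = Instant::now() + timeout;
    loop {
        match get_merge_status() {
            MergeStatus::InProgress => {}
            done => return Ok(done),
        }
        if Instant::now() >= deadline {
            return Err("merge did not finish before the timeout".to_string());
        }
        thread::sleep(poll_interval);
    }
}
```

With the loop on the server side, the agent spends one tool call (one turn) per merge regardless of how long the pipeline takes.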
dave 599fbdc71d huskies: merge 539_bug_crdt_event_bridge_still_writes_filesystem_shadow_files_after_530_eliminated_filesystem_state 2026-04-11 17:04:36 +00:00
dave 6998275331 huskies: merge 540_bug_get_agent_output_mcp_tool_returns_no_agent_for_exited_agents_instead_of_reading_session_logs_from_disk 2026-04-11 16:33:58 +00:00
dave eea54ca616 fix: thread-local CRDT and content store for test isolation
Tests shared a global CRDT singleton and content store HashMap, causing
flaky failures when parallel tests wrote items that polluted each
other's assertions. 3-5 random test failures per run.

Both CRDT_STATE and CONTENT_STORE now use thread_local! in test mode
so each test thread gets its own isolated instance. Production code
is unchanged — it still uses the global OnceLock singletons.

Also fixed 3 tests (create_story_writes_correct_content,
next_item_number_increments_from_existing_bugs,
next_item_number_scans_archived_too) that relied on leaked state
from other tests — they now write to the content store explicitly.

Result: 1902 passed, 0 failed across 5 consecutive runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 13:02:09 +00:00
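The isolation technique in eea54ca616 relies on thread_local! giving every thread its own instance. A minimal sketch, assuming a plain String-to-String map as a stand-in for the real content store; write_content and read_content are hypothetical helpers, and the sketch leaves out the test-mode switch that keeps the global OnceLock in production builds.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
    /// One store per thread, so parallel test threads cannot see each
    /// other's writes (stand-in for a per-test CONTENT_STORE).
    static CONTENT_STORE: RefCell<HashMap<String, String>> = RefCell::new(HashMap::new());
}

/// Write a story body into the current thread's store.
pub fn write_content(story_id: &str, body: &str) {
    CONTENT_STORE.with(|store| {
        store.borrow_mut().insert(story_id.to_string(), body.to_string());
    });
}

/// Read a story body from the current thread's store.
pub fn read_content(story_id: &str) -> Option<String> {
    CONTENT_STORE.with(|store| store.borrow().get(story_id).cloned())
}
```

A test that depended on state written by another test would now read None, which is consistent with the three tests the commit had to fix by writing to the store explicitly.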
dave ea36160667 fix: read_all_items must use deduplicated index, not raw CRDT entries
read_all_items was iterating all CRDT entries including stale duplicates
from earlier stage writes. A story written multiple times (backlog →
current → done) would appear in the output multiple times with different
stages, causing ghost entries in the pipeline status and backlog views.

Now iterates only the index (story_id → visible_index map) which
represents the latest-wins deduplicated view of each story.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:32:55 +00:00
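The fix in ea36160667 can be illustrated with an append-only entry list plus a latest-wins index. This is a simplified model, not the bft-json-crdt data structure: Store, Entry, and write are hypothetical, and the index here maps a story id to the position of its most recent entry.

```rust
use std::collections::HashMap;

/// One raw entry; a story appears once per stage write (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
pub struct Entry {
    pub story_id: String,
    pub stage: String,
}

/// Append-only entries plus a story_id -> latest position index.
#[derive(Default)]
pub struct Store {
    entries: Vec<Entry>,
    index: HashMap<String, usize>,
}

impl Store {
    /// Append a new entry and point the index at it (latest write wins).
    pub fn write(&mut self, story_id: &str, stage: &str) {
        self.entries.push(Entry {
            story_id: story_id.to_string(),
            stage: stage.to_string(),
        });
        self.index.insert(story_id.to_string(), self.entries.len() - 1);
    }

    /// Return one entry per story by walking the index, not the raw list.
    pub fn read_all_items(&self) -> Vec<Entry> {
        let mut positions: Vec<usize> = self.index.values().copied().collect();
        positions.sort_unstable(); // deterministic order for callers
        positions.into_iter().map(|i| self.entries[i].clone()).collect()
    }
}
```

Iterating `entries` directly would return a story once per stage it has passed through, which is the ghost-entry symptom the commit message describes.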
dave 40893a8cb1 huskies: merge 535_bug_chat_status_number_and_mcp_tool_status_still_read_from_filesystem_broken_after_530 2026-04-10 19:01:31 +00:00
dave 6f7a0c7708 huskies: merge 479_story_build_agent_mode_with_crdt_based_work_claiming 2026-04-10 18:50:30 +00:00
dave 11d19d8902 huskies: merge 530_story_eliminate_filesystem_markdown_shadows_entirely_crdt_db_is_the_only_story_store 2026-04-10 14:59:58 +00:00
dave 1dd675796b huskies: merge 531_story_mcp_tool_to_read_agent_session_logs_from_disk_not_just_live_stream 2026-04-10 13:08:51 +00:00
dave 31388da609 huskies: merge 517_story_remove_filesystem_shadow_fallback_paths_from_lifecycle_rs_finish_the_migration_to_crdt_only 2026-04-10 13:00:25 +00:00
dave 61ae30873f huskies: merge 516_story_update_story_description_should_create_the_description_section_if_it_doesn_t_exist_instead_of_erroring 2026-04-10 10:28:53 +00:00
dave f015fe5a1d huskies: merge 515_story_add_a_debug_mcp_tool_to_dump_the_in_memory_crdt_state_for_inspection 2026-04-10 10:24:30 +00:00
dave c6b6be872b huskies: merge 509_bug_create_story_silently_drops_description_and_any_other_unknown_parameters_with_no_error 2026-04-10 10:20:13 +00:00
Timmy 92b212e7fd huskies: merge 504_story_update_story_front_matter_mcp_schema_should_accept_non_string_values_lists_bools_numbers
Squash merge of story 504: add MCP regression tests for non-string
front_matter values (arrays, bools, integers). The schema change itself
was already on master. Fixed the array assertion to match YAML's
space-after-comma inline sequence format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 11:08:21 +01:00
Timmy 9633ab35a6 fix: validate_story_dirs reads filesystem shadows instead of global CRDT singleton (bug 525)
The post-520 migration changed validate_story_dirs to read from
pipeline_state::read_all_typed() (the process-global CRDT singleton),
ignoring its root: &Path argument. This broke test isolation — tests
creating a tempdir saw dozens of results from ambient CRDT state,
causing non-deterministic failures that blocked every mergemaster gate.

Remove the CRDT singleton block and rely on the filesystem shadow scan
that already uses the root argument correctly. 1845/1845 tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 10:52:42 +01:00
dave c324452b38 fix: commit uncommitted native JSON type changes on master
These changes (HashMap<String, String> → HashMap<String, Value> for front matter,
json_value_to_yaml_scalar, and oneOf schema for front_matter) were left uncommitted
on master after a previous merge, blocking the cherry-pick step of story 509's merge.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 22:35:52 +00:00
dave 6f6d37e955 huskies: merge 514_story_delete_story_should_do_a_full_cleanup_crdt_op_db_row_filesystem_shadow_worktree_pending_timers 2026-04-09 22:05:18 +00:00
dave 84717b04bd huskies: merge 520_story_typed_pipeline_state_machine_in_rust_foundation_replaces_stringly_typed_crdt_views_with_strict_enums_subsumes_436 2026-04-09 21:27:48 +00:00
Timmy 1d9287389a feat(521): evict_item primitive + purge_story MCP tool
Adds the foundational capability to clear a story from the running
server's in-memory CRDT state without restarting the process. This is
story 521, motivated by the 2026-04-09 incident where stories 478 and
503 kept resurrecting from in-memory CRDT after every sqlite delete /
worktree removal / timers.json clear. The only previous remedy was a
full docker restart.

Changes:

  - server/src/crdt_state.rs: new `pub fn evict_item(story_id: &str)`.
    Looks up the item's CRDT OpId via the visible-index map, calls the
    bft-json-crdt list `delete()` primitive to construct a tombstone op,
    runs it through the existing `apply_and_persist` machinery (which
    signs, applies to the in-memory CRDT, and queues for persistence to
    crdt_ops), rebuilds the story_id → visible_index map, and drops the
    in-memory CONTENT_STORE entry. The tombstone survives a restart
    because it's persisted as a real CRDT op.

  - server/src/http/mcp/story_tools.rs: new `tool_purge_story` MCP
    handler that takes a story_id and calls evict_item. Deliberately
    minimal — does NOT touch agents, worktrees, pipeline_items shadow
    table, timers.json, or filesystem shadows. Compose with stop_agent,
    remove_worktree, etc. for a full purge. Story 514 (delete_story
    full cleanup) is the future "do it all" tool.

  - server/src/http/mcp/mod.rs: registers the `purge_story` tool in the
    tools list and dispatch table.

Usage:

    mcp__huskies__purge_story story_id="<full_story_id>"

Returns a string confirming the eviction. The story will no longer
appear in get_pipeline_status, list_agents, or any other API that
reads from the in-memory CRDT view, and on the next server restart
the persisted tombstone op will keep it from being reconstructed.

This is a prerequisite for story 514 (delete_story full cleanup) and
useful for any "kill it with fire" operator need.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:29:09 +01:00
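Why a persisted tombstone survives a restart, as 1d9287389a describes, can be shown with a toy op log. This is a deliberately simplified model under stated assumptions: the Op enum, replay, and this evict_item signature are hypothetical, ops are keyed by story id instead of a CRDT OpId, and signing and database persistence are left out.

```rust
use std::collections::BTreeMap;

/// A persisted operation in a toy op log (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
pub enum Op {
    Insert { story_id: String, stage: String },
    Delete { story_id: String }, // tombstone
}

/// Rebuild the in-memory view from the log, as a restart would.
pub fn replay(ops: &[Op]) -> BTreeMap<String, String> {
    let mut view = BTreeMap::new();
    for op in ops {
        match op {
            Op::Insert { story_id, stage } => {
                view.insert(story_id.clone(), stage.clone());
            }
            Op::Delete { story_id } => {
                view.remove(story_id);
            }
        }
    }
    view
}

/// Append a tombstone for `story_id` if it is currently visible.
/// Returns false when there is nothing to evict.
pub fn evict_item(log: &mut Vec<Op>, story_id: &str) -> bool {
    if !replay(log.as_slice()).contains_key(story_id) {
        return false;
    }
    log.push(Op::Delete { story_id: story_id.to_string() });
    true
}
```

Deleting only the in-memory entry would leave the insert in the log, so the story would reappear on the next replay; appending the delete as a real op is what keeps it gone.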
Timmy 13635b01bc wip(501): timer cancellation infrastructure (parallel session WIP + main.rs wiring)
Bundles in-progress work from a parallel Claude session toward fixing
bug 501 (rate-limit retry timer doesn't cancel on stop_agent / move_story
/ successful completion). This commit lands the foundation but the MCP
tool wiring is still TODO.

  - server/src/chat/timer.rs: defense-in-depth check in tick_once that
    skips firing a timer for stories already at or past 3_qa (3_qa, 4_merge,
    5_done, 6_archived). The primary cancellation path will be in the
    MCP tools; this guards races where a timer was scheduled before the
    story was advanced and the tool didn't get a chance to cancel it.

  - server/src/http/context.rs: adds `timer_store: Arc<TimerStore>` field
    on AppContext so MCP tools (move_story, stop_agent, ...) can reach
    the shared timer store and cancel pending entries when the user
    intervenes manually. The test helper is updated to construct one.

  - server/src/main.rs: wires up a TimerStore instance in the AppContext
    initialiser so the binary actually compiles after the context.rs
    field addition. TODO: the matrix bot's spawn_bot still creates its
    own TimerStore instance (in chat/transport/matrix/bot/run.rs:220-227)
    rather than consuming the shared one — that refactor is the next
    step in the bug 501 fix.

What is NOT in this commit and is needed to actually fix bug 501:
  - The MCP tool side (move_story, stop_agent, delete_story) does not
    yet call timer_store.cancel(story_id) when invoked
  - The matrix bot's spawn_bot does not yet consume the shared
    timer_store from AppContext — it still creates its own

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 21:28:48 +01:00
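The defense-in-depth check from 13635b01bc amounts to filtering pending timers by stage before firing. A minimal sketch, assuming timers carry the story's current stage as a string; PendingTimer and this tick_once signature are hypothetical, and only the four stage names listed in the commit message are treated as blocking.

```rust
/// A scheduled retry for one story (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
pub struct PendingTimer {
    pub story_id: String,
    pub stage: String,
}

/// True unless the story has reached 3_qa or a later stage.
pub fn should_fire_timer(stage: &str) -> bool {
    !matches!(stage, "3_qa" | "4_merge" | "5_done" | "6_archived")
}

/// Return the ids of the stories whose timers should fire on this tick.
pub fn tick_once(timers: &[PendingTimer]) -> Vec<String> {
    timers
        .iter()
        .filter(|t| should_fire_timer(&t.stage))
        .map(|t| t.story_id.clone())
        .collect()
}
```

The guard only covers the race where a timer outlives a stage change; explicit cancellation from the MCP tools remains the primary path, as the commit notes.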
Timmy 5765fb57be merge(478): WebSocket CRDT sync layer (manual squash from feature/story-478)
Manual squash-merge of feature/story-478_… into master after the in-pipeline
mergemaster runs failed silently. The 478 agent did substantial real work
across multiple respawn cycles before being interrupted; commits on the
feature branch were intact and verified high-quality but never merged via
the normal pipeline path due to compounding bugs:

- The first mergemaster attempt ran ($0.82 in tokens) and exited "Done"
  cleanly but didn't push anything to master — likely the worktree was
  briefly on master rather than the feature branch when the merge_agent_work
  MCP tool ran, so it found nothing to merge.
- Subsequent timer fires defaulted to spawning coders instead of resuming
  mergemaster, burning more tokens for no progress.
- Bug 510 (split-brain shadows yanking done stories back to current) and
  bug 501 (timers don't cancel on stop/completion) compounded the cost.

What this commit lands:
- server/src/crdt_sync.rs (new, ~518 lines): GET /crdt-sync WebSocket
  handler that subscribes to locally-applied SignedOps and streams them as
  binary frames. Per-peer bounded queue (256 ops) drops slow peers.
- server/src/crdt_state.rs: new public functions subscribe_ops(),
  all_ops_json(), apply_remote_op() backing the sync handler. Adds the
  CRDT_OP_TX broadcast channel (capacity 1024).
- server/src/main.rs: wires up the sync subsystem at startup.
- server/src/http/mod.rs: registers the new endpoint.
- server/src/config.rs: adds optional rendezvous field for outbound peers.
- server/src/worktree.rs: minor changes from the original branch.
- server/Cargo.toml: cfg lint suppression for CrdtNode derive.
- crates/bft-json-crdt/src/debug.rs: fix unused-variable warnings.

Resolved a trivial test-mod merge conflict in crdt_state.rs (both 478 and
503 added new tests at the end of the test module — kept both sets).

Note: this is the squash of the original 478 work that the user explicitly
authorized landing. The earlier rogue commit ac9f3ecf — which added a
DIFFERENT, broken implementation of the same feature directly to master
under the user's identity without consent — was reverted earlier in this
session. The forensic tags rogue-commit-2026-04-09-ac9f3ecf and
pre-502-reset-2026-04-09 still exist for incident audit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:46:29 +01:00
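The per-peer bounded queue that drops slow peers, described for crdt_sync.rs in 5765fb57be, can be approximated with std's bounded channel. This sketch swaps in std::sync::mpsc::sync_channel for whatever async channel the server uses; Peers, connect, and broadcast are hypothetical names, and ops are plain byte vectors.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Fan-out of ops to connected peers, one bounded queue per peer.
pub struct Peers {
    senders: Vec<SyncSender<Vec<u8>>>,
    capacity: usize,
}

impl Peers {
    /// `capacity` is the queue bound per peer (the commit cites 256 ops).
    pub fn new(capacity: usize) -> Self {
        Peers { senders: Vec::new(), capacity }
    }

    /// Register a peer and hand back the receiving end of its queue.
    pub fn connect(&mut self) -> Receiver<Vec<u8>> {
        let (tx, rx) = sync_channel(self.capacity);
        self.senders.push(tx);
        rx
    }

    /// Send `op` to every peer without blocking. Peers whose queue is full
    /// or whose receiver is gone are dropped. Returns how many were dropped.
    pub fn broadcast(&mut self, op: &[u8]) -> usize {
        let before = self.senders.len();
        self.senders.retain(|tx| tx.try_send(op.to_vec()).is_ok());
        before - self.senders.len()
    }

    /// Number of peers still connected.
    pub fn peer_count(&self) -> usize {
        self.senders.len()
    }
}
```

Using try_send rather than send means one stalled peer cannot back-pressure the op producer or delay delivery to the other peers.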
dave 41515e3b8f huskies: merge 503_bug_depends_on_pointing_at_an_archived_story_is_silently_treated_as_deps_met_surprising_users 2026-04-09 18:31:29 +00:00
dave 8fd49d563e huskies: merge 492_story_remove_filesystem_pipeline_state_and_store_story_content_in_database 2026-04-08 03:07:33 +00:00
dave 5c2769dd7d huskies: merge 491_story_watcher_fires_on_crdt_state_transitions_instead_of_filesystem_events 2026-04-08 01:18:30 +00:00
dave 753f7f1c92 fix: comment out premature db::crdt references that broke build
The 490 merge introduced references to a db::crdt module that doesn't
exist yet (it's part of story 491). Commented out with TODO(491)
markers so master compiles. The crdt_state.rs module from 490 is
intact — these are just the call sites that will be wired up when
491 lands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:49:11 +00:00
dave 15a52d6d38 ignore kleppmann_trace test — 10+ min, 12GB RAM
Marked #[ignore] so cargo test skips it by default. Run manually with
--ignored flag when needed for benchmarking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:15:38 +00:00
dave c73153dd4e huskies: merge 490_story_crdt_state_layer_backed_by_sqlite
CRDT state layer backed by SQLite for pipeline state. Integrates the
BFT JSON CRDT crate with SQLite persistence via sqlx. Ops are persisted
and replayed on startup. Node identity via Ed25519 keypair.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:12:19 +00:00
dave 19768c23d5 huskies: merge 494_story_mcp_tool_to_run_project_test_suite 2026-04-07 14:43:41 +00:00
dave a3a3942b0a huskies: merge 493_bug_story_dependency_chain_not_firing_due_to_front_matter_format_issues 2026-04-07 13:32:38 +00:00
dave 4e082009c2 huskies: merge 487_story_display_story_dependencies_in_web_ui_and_chat_commands 2026-04-07 11:49:57 +00:00
dave 7a82a411ec huskies: merge 483_bug_timer_slash_command_not_wired_up_in_web_ui 2026-04-04 21:33:16 +00:00
Timmy 2d8ccb3eb6 huskies: rename project from storkit to huskies
Rename all references from storkit to huskies across the codebase:
- .storkit/ directory → .huskies/
- Binary name, Cargo package name, Docker image references
- Server code, frontend code, config files, scripts
- Fix script/test to build frontend before cargo clippy/test
  so merge worktrees have frontend/dist available for RustEmbed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 16:12:52 +01:00
dave 641384e794 storkit: merge 462_bug_stage_transition_notifications_can_arrive_out_of_order_and_show_wrong_story_name 2026-04-03 12:04:58 +00:00
dave 967a306ea8 storkit: merge 457_bug_store_json_created_at_project_root_instead_of_inside_storkit 2026-04-02 13:27:46 +00:00
dave 57e0197d75 storkit: merge 449_bug_oauth_callback_url_ignores_port_cli_flag 2026-03-31 14:55:46 +00:00
dave fec417cb16 storkit: merge 433_story_setup_wizard_interviews_user_on_bare_projects_with_no_existing_code 2026-03-29 00:46:05 +00:00
dave 5992f9bd19 storkit: merge 438_story_slash_command_autocomplete_in_web_ui_text_input 2026-03-28 22:27:40 +00:00
dave ddc4a57cd2 storkit: merge 444_refactor_extract_shared_test_helpers_test_ctx_write_story_file_make_api 2026-03-28 19:51:17 +00:00
dave fc160b5c5f feat: wizard detects bare projects and prompts user interview for context/stack
wizard_generate now checks if the project has no source code. On bare
projects, the generation hints tell the LLM to ask the user what they
want to build and what tech stack they plan to use, rather than trying
to read a nonexistent codebase.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:17:42 +00:00
dave 49b78f3642 storkit: merge 432_story_complete_setup_wizard_with_mcp_tools_and_agent_driven_file_generation 2026-03-28 14:23:59 +00:00
dave 0b50c66caa storkit: merge 429_story_interactive_project_setup_wizard_for_new_storkit_projects 2026-03-28 13:29:05 +00:00
dave 953fce2ca6 fix(426): verify cherry-pick landed on master before marking story done
After the cherry-pick step in run_squash_merge, verify:
1. project_root is on the base branch (not a merge-queue branch)
2. HEAD commit has actual code changes (not an empty/story-only diff)

If either check fails, return success=false so the story stays in merge
stage for retry instead of being phantom-advanced to done.

Also rename move_story_to_archived → move_story_to_done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 12:37:03 +00:00
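The two post-cherry-pick checks listed in 953fce2ca6 can be expressed as a pure function over already-collected git facts. This is a sketch under assumptions: verify_cherry_pick, its parameters, and the use of a path prefix to recognise story-only files are hypothetical, and gathering the branch name and changed paths from git is left to the caller.

```rust
/// Check that a cherry-pick really landed code on the base branch.
/// `changed_paths` are the files touched by the HEAD commit; paths under
/// `story_dir_prefix` count as story bookkeeping, not code.
pub fn verify_cherry_pick(
    current_branch: &str,
    base_branch: &str,
    changed_paths: &[&str],
    story_dir_prefix: &str,
) -> Result<(), String> {
    // Check 1: the project root must be on the base branch.
    if current_branch != base_branch {
        return Err(format!(
            "expected branch {base_branch}, but project root is on {current_branch}"
        ));
    }
    // Check 2: the commit must touch at least one non-story file.
    let has_code_change = changed_paths
        .iter()
        .any(|path| !path.starts_with(story_dir_prefix));
    if !has_code_change {
        return Err("HEAD commit is empty or changes only story files".to_string());
    }
    Ok(())
}
```

An Err here corresponds to returning success=false, so the story stays in the merge stage for a retry.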
dave 3639d64da6 fix(424): add throttled field to all StoryAgent ctors and handle HardBlock in ws.rs
The initial commit added the `throttled` field to `StoryAgent` but missed
several construction sites in lifecycle.rs, test_helpers.rs, and scan.rs.
Also adds the `HardBlock` match arm in the WebSocket event conversion and
minor CSS/import ordering fixes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 11:33:43 +00:00
dave 8ab2e19e98 fix(423): handle RateLimitHardBlock in ws.rs match
The new WatcherEvent::RateLimitHardBlock variant added in the feature
commit was not covered in the ws.rs From<WatcherEvent> match, causing
a compile error. Add the missing arm returning None (same as
RateLimitWarning — handled by chat notifications only, not WebSocket).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 10:11:44 +00:00
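The compile error fixed in 8ab2e19e98 is the usual consequence of adding an enum variant without extending an exhaustive match. A minimal sketch, assuming simplified types: WsMessage and the StageChanged variant are hypothetical, while RateLimitWarning and RateLimitHardBlock follow the names in the commit message.

```rust
/// Events produced by the watcher (simplified; StageChanged is hypothetical).
#[derive(Clone, Debug, PartialEq)]
pub enum WatcherEvent {
    StageChanged { story_id: String },
    RateLimitWarning,
    RateLimitHardBlock,
}

/// Message pushed to WebSocket clients (hypothetical type).
#[derive(Clone, Debug, PartialEq)]
pub struct WsMessage(pub String);

impl From<WatcherEvent> for Option<WsMessage> {
    fn from(event: WatcherEvent) -> Self {
        // No wildcard arm: a new variant fails to compile until it is
        // handled here, which is how the missing arm was caught.
        match event {
            WatcherEvent::StageChanged { story_id } => {
                Some(WsMessage(format!("stage changed: {story_id}")))
            }
            // Rate-limit events go to chat notifications only.
            WatcherEvent::RateLimitWarning => None,
            WatcherEvent::RateLimitHardBlock => None,
        }
    }
}
```

Keeping the match free of a catch-all arm trades a little verbosity for a compiler reminder at every conversion site when the event set grows.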
dave 6c6bc35785 feat: add unblock command and MCP tool to reset blocked stories
- Add `unblock` bot command (chat + web UI slash command) that clears the
  `blocked` flag and resets `retry_count` to 0 in story front matter
- Works across all pipeline stages (1_backlog through 6_archived)
- Returns confirmation with story name and ID, or clear error if story
  is not found or not blocked
- Expose `unblock_story` MCP tool for programmatic use by agents
- Make `chat::commands::unblock` module pub(crate) so story_tools can
  call `unblock_by_number`
- Add 8 unit tests covering registration, validation, core logic, and
  edge cases (not-found, not-blocked, any stage, story ID in response)
- Update MCP tools list test: 49 → 50 tools
2026-03-28 10:05:51 +00:00
dave 98b5475160 storkit: merge 425_story_chat_notification_when_a_story_blocks_with_reason 2026-03-28 09:38:47 +00:00