`tool_run_tests` in `server/src/http/mcp/shell_tools/script.rs` is fully
blocking server-side: it spawns the test child, polls every 1s server-side
until exit (or `TEST_TIMEOUT_SECS = 1200s`), and returns the full
{passed, exit_code, output} directly. There is NO async/started-status
return path.
But two places told agents the wrong story:
1. `tools_list/system_tools.rs` description claimed "Returns immediately
with status: started. Poll get_test_result..." — agents read tool
descriptions for protocol semantics, so they followed this and burned
turns polling get_test_result.
2. `agents.toml` had been correctly saying it blocks, but my last commit
(776aad38) "fixed" it the wrong way based on a misread of the code.
Now both say: run_tests blocks server-side, returns the full result, do
not poll get_test_result. get_test_result remains for external observers
(UI checking on a job another caller started).
Reverts the prompt change in 776aad38 with the correct text.
Master had 8 uncommitted single-line whitespace changes (blank-line trimming
in test mod headers, etc.) left over from a previous mergemaster cargo-fmt
run that didn't get committed. Each subsequent merge attempt hit:
cherry-pick failed: 'Your local changes to the following files would be
overwritten by merge. Please commit your changes or stash them.'
So merges had been silently un-mergeable for the last several rounds —
mergemaster correctly reported the issue but had no way to fix master's
own state from inside the merge_workspace.
Files affected (all whitespace-only):
- chat/transport/matrix/bot/messages/{handle_message,on_room_message}.rs
- chat/transport/slack/commands/{llm,mod}.rs
- http/mcp/agent_tools/worktree.rs
- http/workflow/story_ops/{create,criterion,update}.rs
cargo clippy --all-targets -- -D warnings: clean
cargo fmt --all --check: clean
2636 tests pass.
The 861-line diagnostics.rs is split:
- permission.rs: tool_prompt_permission + helpers + their tests (584 lines)
- usage.rs: tool_get_token_usage + tests (122 lines)
- mod.rs: server_logs, rebuild, version, loc_file, dump_crdt, move_story + tests (185 lines)
Tests stay co-located. The bigger sub-modules (permission at 584 with tests
mostly under 800; usage at 122) are well within the 800-line guide.
Also added #[allow(unused_imports)] to two now-pedantic re-exports in
service/diagnostics/mod.rs that the split made flag.
All 2636 tests pass; clippy clean.