fix(warm-resume): send non-empty -p prompt with --resume so watchdog

respawns can actually warm

claude-code's --resume <session_id> requires either:
  a) a deferred-tool marker in the resumed session (i.e. the prior
     session paused mid-tool-call), or
  b) a non-empty -p prompt to continue the conversation with.

Watchdog-killed sessions have neither: the kill is asynchronous and
leaves no deferred-tool marker, and our harness was passing an empty
-p (because `resume_context_owned` is None for the common respawn
case). claude-code then aborts with:

  "Error: No deferred tool marker found in the resumed session.
   Either the session was not deferred, the marker is stale (tool
   already ran), or it exceeds the tail-scan window. Provide a
   prompt to continue the conversation."

The harness sees an aborted CLI with no session, prunes the recorded
session_id, and respawns cold — paying the full prompt-cache miss for
EVERY respawn. The new session_store logging (commit 0b50a624) made
this 100% legible: every warm spawn we observed went `mode=warm` →
crash → prune → `mode=cold` within a couple of seconds.

Fix: when resuming with no failure-context to send, default the -p
prompt to a brief "continue from PLAN.md" line. claude-code now has a
valid continuation message and warm-resume should actually work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
dave
2026-05-13 08:27:02 +00:00
parent 0b50a624b8
commit bd517f2857
+12 -1
View File
@@ -263,7 +263,18 @@ pub(super) async fn run_agent_spawn(
Some(_) => { Some(_) => {
// Keep the full rendered prompt as fallback if resume fails. // Keep the full rendered prompt as fallback if resume fails.
let fallback = prompt; let fallback = prompt;
(resume_context_owned.unwrap_or_default(), Some(fallback)) // claude-code's --resume requires a non-empty -p prompt unless the
// resumed session has a deferred-tool marker. Watchdog-killed
// sessions don't have one, so an empty -p aborts the CLI with
// "No deferred tool marker found... Provide a prompt to continue
// the conversation." We default to a brief "continue from
// PLAN.md" message when no gate-failure context was provided.
let resume_msg = resume_context_owned.filter(|s| !s.is_empty()).unwrap_or_else(
|| {
"Continue from your prior session. Read PLAN.md for your current state and pick up where you left off.".to_string()
},
);
(resume_msg, Some(fallback))
} }
None => { None => {
if let Some(ctx) = resume_context_owned { if let Some(ctx) = resume_context_owned {