Commit Graph

3 Commits

Author SHA1 Message Date
dave 0b50a624b8 obs(session_store): log every record/lookup/remove for warm-resume diagnostics
Helps explain WHY each spawn goes warm vs cold. The existing
`spawn mode=warm|cold` log only shows the outcome at the spawn point —
to count where warmth is being lost, we need to see:
  - when a session_id is recorded (and for which key),
  - what every lookup returns (key + Some/None),
  - when remove_sessions_for_story prunes (which is currently the only
    explicit cold-induction path beyond "first ever spawn").

After this lands a grep of "session_store" in the logs gives the full
warm-resume health picture: which (story,agent,model) triples have a
recorded session, which lookups are hitting it, and which prunes are
costing us a warm respawn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 08:12:42 +00:00
dave 66f340a7a3 fix: prune session_store on stdio abort, respawn cold
The bug 882 abort-respawn safeguard caps consecutive crashes at 5 then
blocks the story — but the underlying stdio abort itself stays unfixed:
each respawn calls start_agent which reads session_store.json, finds the
prior session id, passes --resume to claude-code, and re-triggers the
same crash. Five identical respawns later, the story is blocked.

Now: when an abort+no-session exit triggers respawn, we first call
session_store::remove_sessions_for_story to drop every entry for the
story. The next spawn starts cold (no --resume), which avoids the
bloated stdio replay claude-code is choking on.

The function was already implemented but #[cfg(test)] only — promoted
to a non-test pub fn. Existing remove_sessions_for_story_cleans_up test
unchanged and still green.

Net effect: instead of "5 retries, then blocked", we get "1 abort, prune,
respawn cold, agent runs normally". The story can resume work without
losing its worktree state.
2026-04-30 18:19:01 +00:00
dave ac85cfce5d huskies: merge 652_story_pass_resume_session_id_on_agent_respawn_so_new_sessions_inherit_prior_reasoning 2026-04-27 11:27:50 +00:00