
Story Kit: The Story-Driven Test Workflow (SDTW)

Target Audience: Large Language Models (LLMs) acting as Senior Engineers. Goal: To maintain long-term project coherence, prevent context window exhaustion, and ensure high-quality, testable code generation in large software projects.


0. First Steps (For New LLM Sessions)

When you start a new session with this project:

  1. Check for MCP Tools: Read .mcp.json to discover available programmatic tools. If it exists, you have direct access to workflow automation tools (create stories, spawn agents, record tests, etc.) via the MCP server.
  2. Read Context: Check .story_kit/specs/00_CONTEXT.md for high-level project goals.
  3. Read Stack: Check .story_kit/specs/tech/STACK.md for technical constraints and patterns.
  4. Check Stories: Look at .story_kit/stories/upcoming/ and .story_kit/stories/current/ to see what work is pending.

Why This Matters: The .mcp.json file indicates that you have programmatic access to tools like create_story, start_agent, list_agents, get_agent_output, and more. These tools allow you to directly manipulate the workflow and spawn subsidiary agents without manual file manipulation.


1. The Philosophy

We treat the codebase as the implementation of a "Living Specification" driven by User Stories. Instead of ephemeral chat prompts ("Fix this", "Add that"), we work through persistent artifacts.

  • Stories define the Change.
  • Tests define the Truth.
  • Code defines the Reality.

The Golden Rule: You are not allowed to write code until the Acceptance Criteria are captured in the story and the test plan is approved.


1.5 MCP Tools

Agents have programmatic access to the workflow via MCP tools served at POST /mcp. The project .mcp.json registers this endpoint automatically so Claude Code sessions and spawned agents can call tools like create_story, validate_stories, list_upcoming, get_story_todos, record_tests, ensure_acceptance, start_agent, stop_agent, list_agents, and get_agent_output without parsing English instructions.

To discover what tools are available: Check .mcp.json for the server endpoint, then use the MCP protocol to list available tools.
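For orientation, a `.mcp.json` for this setup might look roughly like the following. This is a sketch, assuming an HTTP transport; the server name, host, and port are illustrative, and the real values come from the file already present in your project root:

```json
{
  "mcpServers": {
    "story-kit": {
      "type": "http",
      "url": "http://127.0.0.1:8787/mcp"
    }
  }
}
```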


2. Directory Structure

When initializing a new project under this workflow, create the following structure immediately:

project_root/
├── .mcp.json              # MCP server configuration (if MCP tools are available)
├── .story_kit/
│   ├── README.md          # This document
│   ├── stories/           # Story workflow (upcoming/current/archived).
│   └── specs/             # Minimal guardrails (context + stack).
│       ├── README.md      # Explains this workflow to future sessions.
│       ├── 00_CONTEXT.md  # High-level goals, domain definition, and glossary.
│       ├── tech/          # Implementation details (Stack, Architecture, Constraints).
│       │   └── STACK.md   # The "Constitution" (Languages, Libs, Patterns).
│       └── functional/    # Domain logic (Platform-agnostic behavior).
│           ├── 01_CORE.md
│           └── ...
└── src/                   # The Code.

3. The Cycle (The "Loop")

When the user asks for a feature, follow this 4-step loop strictly:

Step 1: The Story (Ingest)

  • User Input: "I want the robot to dance."
  • Action: Create a story via MCP tool create_story (preferred — guarantees correct front matter and auto-assigns the story number). Alternatively, create a file manually in stories/upcoming/ (e.g., stories/upcoming/XX_robot_dance.md).
  • Front Matter (Required): Every story file MUST begin with YAML front matter containing name and test_plan fields:
    ---
    name: Short Human-Readable Story Name
    test_plan: pending
    ---
    
    The test_plan field tracks approval status: pending → approved (after Step 2).
  • Move to Current: Once the story is validated and ready for coding, move it to stories/current/.
  • Tracking: Mark Acceptance Criteria as tested directly in the story file as tests are completed.
  • Content:
    • User Story: "As a user, I want..."
    • Acceptance Criteria: Bullet points of observable success.
    • Out of Scope: Things explicitly excluded from the story, so the LLM doesn't go off the rails building unrequested features.
  • Story Quality (INVEST): Stories should be Independent, Negotiable, Valuable, Estimable, Small, and Testable.
  • Git: Make a local feature branch for the story, named from the story (e.g., feature/story-33-camera-format-auto-selection). You must create and switch to the feature branch before making any edits.
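As a concrete sketch of the Git step, the commands below create and switch to the feature branch before any edits. The story number and name are illustrative, and the snippet sandboxes itself in a throwaway repo so it runs anywhere; in a real session you would run only the checkout inside your project:

```shell
set -e
# Sandbox: a throwaway repo so the sketch is runnable anywhere (illustrative).
cd "$(mktemp -d)"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
# Create and switch to the feature branch BEFORE making any edits.
git checkout -q -b feature/story-33-camera-format-auto-selection
git rev-parse --abbrev-ref HEAD   # prints the current branch name
```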

Step 2: Test Planning (TDD)

  • Action: Define the test plan for the Story before any implementation.
  • Logic:
    • Identify required unit tests and integration tests.
    • Confirm test frameworks and commands from specs/tech/STACK.md.
    • Ensure Acceptance Criteria are testable and mapped to planned tests.
    • Each Acceptance Criterion must be traceable to a specific test.
  • Output: Show the user the test plan. Wait for approval.
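One way to capture the approved plan directly in the story file is sketched below. The story path, heading, criteria, and test names are all illustrative; the point is that each Acceptance Criterion traces to a named test, and the front-matter flag flips only after the user approves:

```shell
set -e
cd "$(mktemp -d)"                      # sandbox so the sketch runs anywhere
mkdir -p .story_kit/stories/upcoming
story=.story_kit/stories/upcoming/33_camera_format.md
printf -- '---\nname: Camera Format Auto-Selection\ntest_plan: pending\n---\n' > "$story"
# Append the plan: every Acceptance Criterion maps to one named test.
cat >> "$story" <<'EOF'

## Test Plan
- AC1 "all formats listed"     -> unit test list_formats_returns_all
- AC2 "native format preferred" -> unit test auto_select_prefers_native
EOF
# Flip the front-matter flag only after the user approves the plan.
sed -i 's/^test_plan: pending$/test_plan: approved/' "$story"
```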

Step 3: The Implementation (Code)

  • Action: Write the code to satisfy the approved tests and Acceptance Criteria.
  • Constraint: Adhere strictly to specs/tech/STACK.md (e.g., if it says "No unwrap()", you must not use unwrap()).
  • Full-Stack Completion: Every story must be completed across all components of the stack. If a feature touches the backend, frontend, and API layer, all three must be fully implemented and working end-to-end before the story can be accepted. Partial implementations (e.g., backend logic with no frontend wiring, or UI scaffolding with no real data) do not satisfy acceptance criteria.

Step 4: Verification (Close)

  • Action: For each Acceptance Criterion in the story, write a failing test (red), make the test pass (green), refactor if needed, then mark the criterion as tested in the story file. Keep only one failing test at a time.
  • Action: Run compilation and make sure it succeeds without errors. Consult specs/tech/STACK.md and run all required linters listed there (treat warnings as errors). Run tests and make sure they all pass before proceeding. Ask questions here if needed.
  • Action: Do not accept stories yourself. Ask the user if they accept the story, and give them the chance to exclude files via .gitignore before anything is committed.
  • Action: When the user accepts:
    1. Move the story file to stories/archived/ (e.g., mv stories/current/XX_story_name.md stories/archived/)
    2. Commit both changes to the feature branch
    3. Perform the squash merge: git merge --squash feature/story-name
    4. Commit to master with a comprehensive commit message
    5. Delete the feature branch: git branch -D feature/story-name
  • Important: Do NOT mark acceptance criteria as complete before user acceptance. Only mark them complete when the user explicitly accepts the story.
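The five acceptance steps above can be sketched end-to-end as follows. The snippet builds a throwaway repo so it is runnable anywhere; the story number, file names, commit messages, and branch name are illustrative:

```shell
set -e
# Sandbox setup: throwaway repo standing in for a real project (illustrative).
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git commit -q --allow-empty -m "init"
main=$(git rev-parse --abbrev-ref HEAD)          # master or main
mkdir -p .story_kit/stories/current .story_kit/stories/archived
git checkout -q -b feature/story-33-camera-format-auto-selection
echo "story body" > .story_kit/stories/current/33_camera_format.md
git add -A && git commit -q -m "wip: story 33"
# Steps 1-2: archive the accepted story, commit on the feature branch.
git mv .story_kit/stories/current/33_camera_format.md .story_kit/stories/archived/
git commit -q -m "Story 33: archive accepted story"
# Steps 3-5: squash merge into the main branch, commit, delete the branch.
git checkout -q "$main"
git merge -q --squash feature/story-33-camera-format-auto-selection
git commit -q -m "Story 33: camera format auto-selection (squash)"
git branch -q -D feature/story-33-camera-format-auto-selection
```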

CRITICAL - NO SUMMARY DOCUMENTS:

  • NEVER create a separate summary document (e.g., STORY_XX_SUMMARY.md, IMPLEMENTATION_NOTES.md, etc.)
  • NEVER write terminal output to a markdown file for "documentation purposes"
  • Tests are the primary source of truth. Keep test coverage and Acceptance Criteria aligned after each story.
  • If you find yourself typing cat << 'EOF' > SUMMARY.md or similar, STOP IMMEDIATELY.
  • The only files that should exist after story completion:
    • Updated code in src/
    • Updated guardrails in specs/ (if needed)
    • Archived story in stories/archived/

3.5. Bug Workflow (Simplified Path)

Not everything needs to be a full story. Simple bugs can skip the story process:

When to Use Bug Workflow

  • Defects in existing functionality (not new features)
  • State inconsistencies or data corruption
  • UI glitches that don't require spec changes
  • Performance issues with known fixes

Bug Process

  1. Document Bug: Create bugs/bug-N-short-description.md with:
    • Symptom: What the user observes
    • Root Cause: Technical explanation (if known)
    • Reproduction Steps: How to trigger the bug
    • Proposed Fix: Brief technical approach
    • Workaround: Temporary solution if available
  2. Create a Worktree: Create a git worktree and branch for the fix (e.g., git worktree add ../project-bug-N -b bugfix/bug-N-description master).
  3. Write a Failing Test: Before fixing the bug, write a test that reproduces it (red). This proves the bug exists and prevents regression.
  4. Fix the Bug: Make minimal code changes to make the test pass (green).
  5. User Testing: Let the user verify the fix in the worktree before merging. Do not proceed until they confirm.
  6. Archive & Merge: Move the bug file to bugs/archive/, squash merge to master, delete the worktree and branch.
  7. No Guardrail Update Needed: Skip spec updates unless the bug reveals a missing constraint.
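Step 2's worktree keeps the fix isolated from your main checkout. A sketch, again sandboxed in a throwaway repo so it runs anywhere; the bug number, branch name, and directory names are illustrative:

```shell
set -e
# Sandbox: throwaway project repo (illustrative).
cd "$(mktemp -d)"
git init -q project
cd project
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
main=$(git rev-parse --abbrev-ref HEAD)
# One worktree per bug: a sibling directory on a dedicated bugfix branch.
git worktree add -q ../project-bug-7 -b bugfix/bug-7-stale-cache "$main"
git worktree list
```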

Bug vs Story

  • Bug: Existing functionality is broken → Fix it
  • Story: New functionality is needed → Test it, then build it
  • Spike: Uncertainty/feasibility discovery → Run spike workflow

3.6. Spike Workflow (Research Path)

Not everything needs a story or bug fix. Spikes are time-boxed investigations to reduce uncertainty.

When to Use a Spike

  • Unclear root cause or feasibility
  • Need to compare libraries/encoders/formats
  • Need to validate performance constraints

Spike Process

  1. Document Spike: Create spikes/spike-N-short-description.md with:
    • Question: What you need to answer
    • Hypothesis: What you expect to be true
    • Timebox: Strict limit for the research
    • Investigation Plan: Steps/tools to use
    • Findings: Evidence and observations
    • Recommendation: Next step (Story, Bug, or No Action)
  2. Execute Research: Stay within the timebox. No production code changes.
  3. Escalate if Needed: If implementation is required, open a Story or Bug and follow that workflow.
  4. Archive: Move completed spikes to spikes/archive/.

Spike Output

  • Decision and evidence, not production code
  • Specs updated only if the spike changes system truth

4. Context Reset Protocol

When the LLM context window fills up (or the chat gets slow/confused):

  1. Stop Coding.
  2. Instruction: Tell the user to open a new chat.
  3. Handoff: The only context the new LLM needs is in the specs/ folder and .mcp.json.
    • Prompt for New Session: "I am working on Project X. Read .mcp.json to discover available tools, then read specs/00_CONTEXT.md and specs/tech/STACK.md. Then look at stories/ to see what is pending."

5. Setup Instructions (For the LLM)

If a user hands you this document and says "Apply this process to my project":

  1. Check for MCP Tools: Look for .mcp.json in the project root. If it exists, you have programmatic access to workflow tools and agent spawning capabilities.
  2. Analyze the Request: Ask for the high-level goal ("What are we building?") and the tech preferences ("Rust or Python?").
  3. Git Check: Check if the directory is a git repository (git status). If not, run git init.
  4. Scaffold: Run commands to create the specs/ and stories/ folders.
  5. Draft Context: Write specs/00_CONTEXT.md based on the user's answer.
  6. Draft Stack: Write specs/tech/STACK.md based on best practices for that language.
  7. Wait: Ask the user for "Story #1".

6. Code Quality Tools

MANDATORY: Before completing Step 4 (Verification) of any story, you MUST run all applicable linters and fix ALL errors and warnings. Zero tolerance for warnings or errors.

AUTO-RUN CHECKS: Always run the required lint/test/build checks as soon as relevant changes are made. Do not ask for permission to run them—run them automatically and fix any failures.

ALWAYS FIX DIAGNOSTICS: At every stage, you must proactively fix all errors and warnings without waiting for user confirmation. Do not pause to ask whether to fix diagnostics—fix them immediately as part of the workflow.

TypeScript/JavaScript: Biome

  • Tool: Biome - Fast formatter and linter
  • Check Command: npx @biomejs/biome check src/
  • Fix Command: npx @biomejs/biome check --write src/
  • Unsafe Fixes: npx @biomejs/biome check --write --unsafe src/
  • Configuration: biome.json in project root
  • When to Run:
    • After every code change to TypeScript/React files
    • Before committing any frontend changes
    • During Step 4 (Verification) - must show 0 errors, 0 warnings

Biome Rules to Follow:

  • No any types (use proper TypeScript types or unknown)
  • No array index as key in React (use stable IDs)
  • No assignments in expressions (extract to separate statements)
  • All buttons must have explicit type prop (button, submit, or reset)
  • Mouse events must be accompanied by keyboard events for accessibility
  • Use template literals instead of string concatenation
  • Import types with import type { } syntax
  • Organize imports automatically

Rust: Clippy

  • Tool: Clippy - Rust linter
  • Check Command: cargo clippy --all-targets --all-features -- -D warnings (the -D warnings flag makes warnings fail the check, matching the zero-tolerance policy)
  • Fix Command: cargo clippy --fix --allow-dirty --allow-staged
  • When to Run:
    • After every code change to Rust files
    • Before committing any backend changes
    • During Step 4 (Verification) - must show 0 errors, 0 warnings

Clippy Rules to Follow:

  • No unused variables (prefix with _ if intentionally unused)
  • No dead code (remove or mark with #[allow(dead_code)] if used conditionally)
  • Use ? operator instead of explicit error handling where possible
  • Prefer if let over match for single-pattern matches
  • Use meaningful variable names
  • Follow Rust idioms and best practices

Build Verification Checklist

Before asking for user acceptance in Step 4:

  • Run cargo clippy (Rust) - 0 errors, 0 warnings
  • Run cargo check (Rust) - successful compilation
  • Run cargo test (Rust) - all tests pass
  • Run npx @biomejs/biome check src/ (TypeScript) - 0 errors, 0 warnings
  • Run npm run build (TypeScript) - successful build
  • Manually test the feature works as expected
  • All acceptance criteria verified

Failure to meet these criteria means the story is NOT ready for acceptance.
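The automated part of the checklist can be wrapped in one gate script. A sketch, assuming a Rust backend (detected via Cargo.toml) and a TypeScript frontend (detected via package.json); each group of checks is skipped when its marker file is absent, so the script adapts to what the project actually contains. Run it from your real project root; the cd below only sandboxes this demo:

```shell
set -e
cd "$(mktemp -d)"   # sandbox for the demo; in a real project, run from its root
ran=""
# Rust gates: clippy with warnings as errors, compile check, full test suite.
if [ -f Cargo.toml ]; then
  cargo clippy --all-targets --all-features -- -D warnings
  cargo check
  cargo test
  ran="$ran rust"
fi
# TypeScript gates: Biome lint (0 errors, 0 warnings) and production build.
if [ -f package.json ]; then
  npx @biomejs/biome check src/
  npm run build
  ran="$ran ts"
fi
echo "automated gates passed"
gate_done=1
```

Manual feature testing and acceptance-criteria verification still happen after the script, since they cannot be automated here.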