# Story Kit: The Story-Driven Test Workflow (SDTW)

**Target Audience:** Large Language Models (LLMs) acting as Senior Engineers.

**Goal:** To maintain long-term project coherence, prevent context window exhaustion, and ensure high-quality, testable code generation in large software projects.

---

## 0. First Steps (For New LLM Sessions)

When you start a new session with this project:

1. **Check for MCP Tools:** Read `.mcp.json` to discover the MCP server endpoint. Then list available tools by calling:

   ```bash
   curl -s "$(jq -r '.mcpServers["story-kit"].url' .mcp.json)" \
     -H 'Content-Type: application/json' \
     -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
   ```

   This returns the full tool catalog (create stories, spawn agents, record tests, manage worktrees, etc.). Familiarize yourself with the available tools before proceeding. These tools allow you to directly manipulate the workflow and spawn subsidiary agents without manual file manipulation.
2. **Read Context:** Check `.story_kit/specs/00_CONTEXT.md` for high-level project goals.
3. **Read Stack:** Check `.story_kit/specs/tech/STACK.md` for technical constraints and patterns.
4. **Check Work Items:** Look at `.story_kit/work/1_upcoming/` and `.story_kit/work/2_current/` to see what work is pending.

---

## 1. The Philosophy

We treat the codebase as the implementation of a **"Living Specification"** driven by **User Stories.** Instead of ephemeral chat prompts ("Fix this", "Add that"), we work through persistent artifacts.

* **Stories** define the *Change*.
* **Tests** define the *Truth*.
* **Code** defines the *Reality*.

**The Golden Rule:** You are not allowed to write code until the Acceptance Criteria are captured in the story.

---

## 1.5 MCP Tools

Agents have programmatic access to the workflow via MCP tools served at `POST /mcp`.
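Once the catalog is known, a specific tool is invoked with the JSON-RPC `tools/call` method. A minimal sketch follows; the `create_story` arguments shown (`type`, `slug`) are illustrative assumptions, not a documented schema, so consult the `tools/list` output for each tool's real input schema.

```shell
# Sketch of a "tools/call" request body for the create_story tool.
# NOTE: the argument names ("type", "slug") are illustrative assumptions;
# each tool's real input schema is reported by tools/list.
REQUEST='{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"create_story","arguments":{"type":"story","slug":"robot_dance"}}}'

# It is sent the same way as the tools/list example in section 0:
#   curl -s "$(jq -r '.mcpServers["story-kit"].url' .mcp.json)" \
#     -H 'Content-Type: application/json' -d "$REQUEST"
echo "$REQUEST"
```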
The project `.mcp.json` registers this endpoint automatically so Claude Code sessions and spawned agents can call tools like `create_story`, `validate_stories`, `list_upcoming`, `get_story_todos`, `record_tests`, `ensure_acceptance`, `start_agent`, `stop_agent`, `list_agents`, and `get_agent_output` without parsing English instructions.

**To discover what tools are available:** Check `.mcp.json` for the server endpoint, then use the MCP protocol to list available tools.

---

## 2. Directory Structure

```text
project_root/
  .mcp.json              # MCP server configuration (if MCP tools are available)
  .story_kit/
  ├── README.md          # This document
  ├── project.toml       # Agent configuration (roles, models, prompts)
  ├── work/              # Unified work item pipeline (stories, bugs, spikes)
  │   ├── 1_upcoming/    # New work items awaiting implementation
  │   ├── 2_current/     # Work in progress
  │   ├── 3_qa/          # QA review
  │   ├── 4_merge/       # Ready to merge to master
  │   ├── 5_done/        # Merged and completed (auto-swept to 6_archived after 4 hours)
  │   └── 6_archived/    # Long-term archive
  ├── worktrees/         # Agent worktrees (managed by the server)
  ├── specs/             # Minimal guardrails (context + stack)
  │   ├── 00_CONTEXT.md  # High-level goals, domain definition, and glossary
  │   ├── tech/          # Implementation details (Stack, Architecture, Constraints)
  │   │   └── STACK.md   # The "Constitution" (Languages, Libs, Patterns)
  │   └── functional/    # Domain logic (Platform-agnostic behavior)
  │       └── ...
  └── src/               # The Code
```

### Work Items

All work items (stories, bugs, spikes) live in the same `work/` pipeline. Items are named: `{id}_{type}_{slug}.md`

* Stories: `57_story_live_test_gate_updates.md`
* Bugs: `4_bug_run_button_does_not_start_agent.md`
* Spikes: `61_spike_filesystem_watcher_architecture.md`

Items move through stages by moving the file between directories:

`1_upcoming` → `2_current` → `3_qa` → `4_merge` → `5_done` → `6_archived`

Items in `5_done` are auto-swept to `6_archived` after 4 hours by the server.
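As a concrete sketch, the pipeline above can be scaffolded and exercised with plain file operations (the watcher, described next, handles the git commits). The story id `12` and slug below are invented for illustration:

```shell
# Create the 6-stage pipeline (normally a one-time setup step).
mkdir -p .story_kit/work/1_upcoming .story_kit/work/2_current \
         .story_kit/work/3_qa .story_kit/work/4_merge \
         .story_kit/work/5_done .story_kit/work/6_archived

# A new story lands in 1_upcoming with the mandatory front matter.
# The id (12) and slug are made up for this example.
cat > .story_kit/work/1_upcoming/12_story_example_feature.md <<'EOF'
---
name: Example Feature
---
EOF

# Promoting the item to "in progress" is just a move between stage
# directories; the filesystem watcher auto-commits the change.
mv .story_kit/work/1_upcoming/12_story_example_feature.md \
   .story_kit/work/2_current/
```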
### Filesystem Watcher

The server watches `.story_kit/work/` for changes. When a file is created, moved, or modified, the watcher auto-commits with a deterministic message and broadcasts a WebSocket notification to the frontend. This means:

* MCP tools only need to write/move files — the watcher handles git commits
* IDE drag-and-drop works (drag a story from `1_upcoming/` to `2_current/`)
* The frontend updates automatically without manual refresh

---

## 3. The Cycle (The "Loop")

When the user asks for a feature, follow this 4-step loop strictly:

### Step 1: The Story (Ingest)

* **User Input:** "I want the robot to dance."
* **Action:** Create a story via the MCP tool `create_story` (it guarantees correct front matter and auto-assigns the story number).
* **Front Matter (Required):** Every work item file MUST begin with YAML front matter containing a `name` field:

  ```yaml
  ---
  name: Short Human-Readable Story Name
  ---
  ```

* **Move to Current:** Once the story is validated and ready for coding, move it to `work/2_current/`.
* **Tracking:** Mark Acceptance Criteria as tested directly in the story file as tests are completed.
* **Content:**
  * **User Story:** "As a user, I want..."
  * **Acceptance Criteria:** Bullet points of observable success.
  * **Out of Scope:** Explicit exclusions, so the LLM does not expand the work beyond the story's boundaries.
* **Story Quality (INVEST):** Stories should be Independent, Negotiable, Valuable, Estimable, Small, and Testable.
* **Git:** The `start_agent` MCP tool automatically creates a worktree under `.story_kit/worktrees/`, checks out a feature branch, moves the story to `work/2_current/`, and spawns the agent. No manual branch or worktree creation is needed.

### Step 2: The Implementation (Code)

* **Action:** Write the code to satisfy the approved tests and Acceptance Criteria.
* **Constraint:** Adhere strictly to `specs/tech/STACK.md` (e.g., if it forbids certain patterns, you must not use them).
* **Full-Stack Completion:** Every story must be completed across all components of the stack. If a feature touches the backend, frontend, and API layer, all three must be fully implemented and working end-to-end before the story can be accepted. Partial implementations (e.g., backend logic with no frontend wiring, or UI scaffolding with no real data) do not satisfy acceptance criteria.

### Step 3: Verification (Close)

* **Action:** For each Acceptance Criterion in the story, write a failing test (red), mark the criterion as tested, make the test pass (green), and refactor if needed. Keep only one failing test at a time.
* **Action:** Run compilation and make sure it succeeds without errors. Consult `specs/tech/STACK.md` and run all required linters listed there (treat warnings as errors). Run tests and make sure they all pass before proceeding. Ask questions here if needed.
* **Action:** Do not accept stories yourself. Ask the user whether they accept the story.
* **Action:** When the user accepts:
  1. Move the story file from `work/2_current/` (or `work/4_merge/`) to `work/5_done/`
  2. Commit both changes to the feature branch
  3. Perform the squash merge: `git merge --squash feature/story-name`
  4. Commit to master with a comprehensive commit message
  5. Delete the feature branch: `git branch -D feature/story-name`
* **Important:** Do NOT mark acceptance criteria as complete before user acceptance. Only mark them complete when the user explicitly accepts the story.

**CRITICAL - NO SUMMARY DOCUMENTS:**

* **NEVER** create a separate summary document (e.g., `STORY_XX_SUMMARY.md`, `IMPLEMENTATION_NOTES.md`, etc.)
* **NEVER** write terminal output to a markdown file for "documentation purposes"
* Tests are the primary source of truth. Keep test coverage and Acceptance Criteria aligned after each story.
* If you find yourself typing `cat << 'EOF' > SUMMARY.md` or similar, **STOP IMMEDIATELY**.
* The only files that should exist after story completion:
  * Updated code in `src/`
  * Updated guardrails in `specs/` (if needed)
  * Archived work item in `work/5_done/` (server auto-sweeps to `work/6_archived/` after 4 hours)

---

## 3.5. Bug Workflow (Simplified Path)

Not everything needs to be a full story. Simple bugs can skip the story process:

### When to Use Bug Workflow

* Defects in existing functionality (not new features)
* State inconsistencies or data corruption
* UI glitches that don't require spec changes
* Performance issues with known fixes

### Bug Process

1. **Document Bug:** Create a bug file in `work/1_upcoming/` named `{id}_bug_{slug}.md` with:
   * **Symptom:** What the user observes
   * **Root Cause:** Technical explanation (if known)
   * **Reproduction Steps:** How to trigger the bug
   * **Proposed Fix:** Brief technical approach
   * **Workaround:** Temporary solution if available
2. **Start an Agent:** Use the `start_agent` MCP tool to create a worktree and spawn an agent for the bug fix.
3. **Write a Failing Test:** Before fixing the bug, write a test that reproduces it (red). This proves the bug exists and prevents regression.
4. **Fix the Bug:** Make minimal code changes to make the test pass (green).
5. **User Testing:** Let the user verify the fix in the worktree before merging. Do not proceed until they confirm.
6. **Archive & Merge:** Move the bug file to `work/5_done/`, squash merge to master, delete the worktree and branch.
7. **No Guardrail Update Needed:** Unless the bug reveals a missing constraint.

### Bug vs Story vs Spike

* **Bug:** Existing functionality is broken → Fix it
* **Story:** New functionality is needed → Test it, then build it
* **Spike:** Uncertainty/feasibility discovery → Run spike workflow

---

## 3.6. Spike Workflow (Research Path)

Not everything needs a story or bug fix. Spikes are time-boxed investigations to reduce uncertainty.
### When to Use a Spike

* Unclear root cause or feasibility
* Need to compare libraries/encoders/formats
* Need to validate performance constraints

### Spike Process

1. **Document Spike:** Create a spike file in `work/1_upcoming/` named `{id}_spike_{slug}.md` with:
   * **Question:** What you need to answer
   * **Hypothesis:** What you expect to be true
   * **Timebox:** Strict limit for the research
   * **Investigation Plan:** Steps/tools to use
   * **Findings:** Evidence and observations
   * **Recommendation:** Next step (Story, Bug, or No Action)
2. **Execute Research:** Stay within the timebox. No production code changes.
3. **Escalate if Needed:** If implementation is required, open a Story or Bug and follow that workflow.
4. **Archive:** Move the spike file to `work/5_done/`.

### Spike Output

* Decision and evidence, not production code
* Specs updated only if the spike changes system truth

---

## 4. Context Reset Protocol

When the LLM context window fills up (or the chat gets slow/confused):

1. **Stop Coding.**
2. **Instruction:** Tell the user to open a new chat.
3. **Handoff:** The only context the new LLM needs is in the `specs/` folder and `.mcp.json`.
   * *Prompt for New Session:* "I am working on Project X. Read `.mcp.json` to discover available tools, then read `specs/00_CONTEXT.md` and `specs/tech/STACK.md`. Then look at `work/1_upcoming/` and `work/2_current/` to see what is pending."

---

## 5. Setup Instructions (For the LLM)

If a user hands you this document and says "Apply this process to my project":

1. **Check for MCP Tools:** Look for `.mcp.json` in the project root. If it exists, you have programmatic access to workflow tools and agent spawning capabilities.
2. **Analyze the Request:** Ask for the high-level goal ("What are we building?") and the tech preferences ("Rust or Python?").
3. **Git Check:** Check if the directory is a git repository (`git status`). If not, run `git init`.
4. **Scaffold:** Run commands to create the `work/` and `specs/` folders with the 6-stage pipeline (`work/1_upcoming/` through `work/6_archived/`).
5. **Draft Context:** Write `specs/00_CONTEXT.md` based on the user's answers.
6. **Draft Stack:** Write `specs/tech/STACK.md` based on best practices for that language.
7. **Wait:** Ask the user for "Story #1".

---

## 6. Code Quality

**MANDATORY:** Before completing Step 3 (Verification) of any story, you MUST run all applicable linters, formatters, and test suites and fix ALL errors and warnings. Zero tolerance for warnings or errors.

**AUTO-RUN CHECKS:** Always run the required lint/test/build checks as soon as relevant changes are made. Do not ask for permission to run them—run them automatically and fix any failures.

**ALWAYS FIX DIAGNOSTICS:** At every stage, you must proactively fix all errors and warnings without waiting for user confirmation. Do not pause to ask whether to fix diagnostics—fix them immediately as part of the workflow.

**Consult `specs/tech/STACK.md`** for the specific tools, commands, linter configurations, and quality gates for this project. The STACK file is the single source of truth for what must pass before a story can be accepted.
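The zero-tolerance policy above amounts to a fail-fast gate: run every required check in order and stop at the first failure. A minimal sketch follows; the actual check commands come from `specs/tech/STACK.md`, and the demo commands below are trivial stand-ins.

```shell
# Fail-fast quality gate: run each check in order, stop on first failure.
# The real check list comes from specs/tech/STACK.md; the demo commands
# passed in below are stand-ins, not project-specific tooling.
run_gate() {
  for check in "$@"; do
    echo "running: $check"
    if ! sh -c "$check"; then
      echo "GATE FAILED: $check" >&2
      return 1
    fi
  done
  echo "all checks passed"
}

# Demo with stand-in checks (substitute your formatter, linter, tests):
if run_gate "echo format ok" "echo lint ok" "echo tests ok"; then
  GATE_RESULT=pass
else
  GATE_RESULT=fail
fi
```

Because the gate returns non-zero on the first failing check, it composes cleanly with CI steps or a pre-merge hook.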