Merge story-31: View Upcoming Stories

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

# Conflicts:
#	frontend/src/api/workflow.ts
#	frontend/src/components/Chat.test.tsx
#	frontend/src/components/Chat.tsx
#	server/src/http/workflow.rs
2026-02-19 15:54:02 +00:00
parent 5f5c09461b 939387104b
commit 3807f0e625
11 changed files with 521 additions and 2 deletions
--- a/.story_kit/stories/archived/31_view_upcoming_stories.md
+++ b/.story_kit/stories/archived/31_view_upcoming_stories.md
@@ -1,6 +1,6 @@
 ---
 name: View Upcoming Stories
-test_plan: pending
+test_plan: approved
 ---

 # Story 31: View Upcoming Stories
--- a/.story_kit/stories/upcoming/34_agent_security_and_sandboxing.md
+++ b/.story_kit/stories/upcoming/34_agent_security_and_sandboxing.md
@@ -0,0 +1,28 @@
+# Story 34: Agent Security and Sandboxing
+
+## User Story
+**As a** supervisor orchestrating multiple autonomous agents,
+**I want to** constrain what each agent can access and do,
+**So that** agents can't escape their worktree, damage shared state, or perform unintended actions.
+
+## Acceptance Criteria
+- [ ] Agent creation accepts an `allowed_tools` list to restrict Claude Code tool access per agent.
+- [ ] Agent creation accepts a `disallowed_tools` list as an alternative to allowlisting.
+- [ ] Agents without Bash access can still perform useful coding work (Read, Edit, Write, Glob, Grep).
+- [ ] Investigate replacing direct Bash/shell access with Rust-implemented tool proxies that enforce boundaries:
+  - Scoped `exec_shell` that only runs allowlisted commands (e.g., `cargo test`, `npm test`) within the agent's worktree.
+  - Scoped `read_file` / `write_file` that reject paths outside the agent's worktree root.
+  - Scoped `git` operations that only work within the agent's worktree.
+- [ ] Evaluate `--max-turns` and `--max-budget-usd` as safety limits for runaway agents.
+- [ ] Document the trust model: what the supervisor controls vs what agents can do autonomously.
+
+## Questions to Explore
+- Can we use MCP (Model Context Protocol) to expose our Rust-implemented tools to Claude Code, replacing its built-in Bash/filesystem tools with scoped versions?
+- What's the right granularity for shell allowlists — command-level (`cargo test`) or pattern-level (`cargo *`)?
+- Should agents have read access outside their worktree (e.g., to reference shared specs) but write access only within it?
+- Is OS-level sandboxing (Docker, macOS sandbox profiles) worth the complexity for a personal tool?
+
+## Out of Scope
+- Multi-user authentication or authorization (single-user personal tool).
+- Network-level isolation between agents.
+- Encrypting agent communication channels (all local).
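
Note on Story 34 (not part of the commit): the scoped `read_file` / `write_file` and `exec_shell` criteria could be approached roughly as below. This is a minimal sketch under stated assumptions, not the project's implementation. The names `resolve_in_worktree` and `is_allowed_command` are hypothetical, the worktree root is assumed to be an absolute, already-normalized Unix path, and the path check is purely lexical, so it does not resolve symlinks.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: resolve `requested` against `worktree_root` and
/// return the normalized path only if it stays inside the root.
/// Assumption: `worktree_root` is absolute and already normalized.
/// The check is lexical and never touches the filesystem, so a symlink
/// inside the worktree that points outside it would NOT be caught here.
fn resolve_in_worktree(worktree_root: &Path, requested: &Path) -> Option<PathBuf> {
    // Relative requests are interpreted relative to the worktree root;
    // absolute requests are checked as given.
    let joined = if requested.is_absolute() {
        requested.to_path_buf()
    } else {
        worktree_root.join(requested)
    };

    // Collapse `.` and `..` without consulting the filesystem.
    let mut normalized = PathBuf::new();
    for component in joined.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                // Refuse to climb above the filesystem root.
                if !normalized.pop() {
                    return None;
                }
            }
            other => normalized.push(other.as_os_str()),
        }
    }

    // `Path::starts_with` compares whole components, so a sibling such as
    // `/work/agent-10` is not treated as being inside `/work/agent-1`.
    if normalized.starts_with(worktree_root) {
        Some(normalized)
    } else {
        None
    }
}

/// Hypothetical helper: command-level allowlisting. A command is accepted
/// when its argv begins with one of the allowlisted prefixes, so
/// `["cargo", "test"]` also permits `cargo test --lib`.
fn is_allowed_command(allowlist: &[&[&str]], argv: &[&str]) -> bool {
    allowlist
        .iter()
        .any(|allowed| !allowed.is_empty() && argv.starts_with(*allowed))
}
```

With a root of `/work/agent-1`, a request for `src/main.rs` resolves inside the worktree, while `../agent-2/secret` and `/etc/passwd` are rejected. Matching on argv prefixes rather than on a shell string sidesteps quoting and `;` injection, though it leaves the command-level versus pattern-level question from the story open.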