diff --git a/.story_kit/stories/upcoming/34_agent_security_and_sandboxing.md b/.story_kit/stories/upcoming/34_agent_security_and_sandboxing.md
deleted file mode 100644
index bf9b51f..0000000
--- a/.story_kit/stories/upcoming/34_agent_security_and_sandboxing.md
+++ /dev/null
@@ -1,28 +0,0 @@
-# Story 34: Agent Security and Sandboxing
-
-## User Story
-**As a** supervisor orchestrating multiple autonomous agents,
-**I want to** constrain what each agent can access and do,
-**So that** agents can't escape their worktree, damage shared state, or perform unintended actions.
-
-## Acceptance Criteria
-- [ ] Agent creation accepts an `allowed_tools` list to restrict Claude Code tool access per agent.
-- [ ] Agent creation accepts a `disallowed_tools` list as an alternative to allowlisting.
-- [ ] Agents without Bash access can still perform useful coding work (Read, Edit, Write, Glob, Grep).
-- [ ] Investigate replacing direct Bash/shell access with Rust-implemented tool proxies that enforce boundaries:
-  - Scoped `exec_shell` that only runs allowlisted commands (e.g., `cargo test`, `npm test`) within the agent's worktree.
-  - Scoped `read_file` / `write_file` that reject paths outside the agent's worktree root.
-  - Scoped `git` operations that only work within the agent's worktree.
-- [ ] Evaluate `--max-turns` and `--max-budget-usd` as safety limits for runaway agents.
-- [ ] Document the trust model: what the supervisor controls vs what agents can do autonomously.
-
-## Questions to Explore
-- Can we use MCP (Model Context Protocol) to expose our Rust-implemented tools to Claude Code, replacing its built-in Bash/filesystem tools with scoped versions?
-- What's the right granularity for shell allowlists — command-level (`cargo test`) or pattern-level (`cargo *`)?
-- Should agents have read access outside their worktree (e.g., to reference shared specs) but write access only within it?
-- Is OS-level sandboxing (Docker, macOS sandbox profiles) worth the complexity for a personal tool?
-
-## Out of Scope
-- Multi-user authentication or authorization (single-user personal tool).
-- Network-level isolation between agents.
-- Encrypting agent communication channels (all local).
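
Reviewer note: the scoped `read_file` / `write_file` and `exec_shell` criteria in the removed story could be prototyped along the following lines. This is a minimal sketch, not code from this repository. The function names `resolve_in_worktree` and `is_allowed_command` are hypothetical, the path check is purely lexical (it does not follow symlinks, so a real proxy would probably also canonicalize with `std::fs::canonicalize`), and the allowlist uses command-level prefix matching on argv tokens, which is only one of the granularities the story asks about.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: resolve a relative `requested` path against the
/// agent's worktree `root`, returning `None` if it would escape the root.
/// Lexical only: symlinks inside the worktree are not resolved here.
pub fn resolve_in_worktree(root: &Path, requested: &Path) -> Option<PathBuf> {
    let mut resolved = root.to_path_buf();
    // Number of components pushed below `root` so far.
    let mut depth = 0usize;
    for component in requested.components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                // A `..` at depth 0 would climb above the worktree root.
                if depth == 0 {
                    return None;
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute or drive-prefixed paths are rejected outright.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}

/// Hypothetical helper: command-level allowlist check. `argv` is allowed
/// when it starts with one of the allowlisted token sequences, so
/// `["cargo", "test"]` permits `cargo test --release` but not `cargo publish`.
pub fn is_allowed_command(allowlist: &[&[&str]], argv: &[&str]) -> bool {
    allowlist
        .iter()
        .any(|allowed| !allowed.is_empty() && argv.starts_with(*allowed))
}
```

Under these assumptions, a scoped `write_file` would call `resolve_in_worktree` first and refuse the write on `None`, and a scoped `exec_shell` would check `is_allowed_command` before spawning the process with its working directory set to the worktree root. Prefix matching still lets extra arguments through (for example `cargo test -- --some-flag`), which bears on the command-level versus pattern-level question raised in the story.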