From dbc884968162b33cfade1d7e728969fa1e1dc3d3 Mon Sep 17 00:00:00 2001
From: Dave
Date: Fri, 20 Mar 2026 07:32:12 +0000
Subject: [PATCH] story-kit: create 329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting

---
 ...r_agent_isolation_and_resource_limiting.md | 50 +++++++++++++------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md b/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
index 8b6efb6..642076a 100644
--- a/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
+++ b/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
@@ -6,28 +6,46 @@ name: "Evaluate Docker/OrbStack for agent isolation and resource limiting"
 
 ## Question
 
-Investigate using Docker (or OrbStack as a faster macOS alternative) to isolate agent processes from the host. Currently agents run as bare Claude Code processes on the host with full filesystem and network access. Docker could provide:
+Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.
 
-1. **Filesystem isolation** — agents only see their worktree, not the host filesystem
-2. **Network isolation** — agents can't talk to Matrix, SSH, or external services unless explicitly allowed
-3. **Resource limits** — cap CPU and memory per agent to prevent load average spikes (currently hitting 27)
-4. **Clean environments** — each agent gets a fresh container with just the toolchain
-5. **Kill switch** — docker kill is cleaner than tracking PTY child processes
+Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:
+
+1. **Host isolation** — storkit can't touch anything outside the container
+2. **Clean install/uninstall** — `docker run` to start, `docker rm` to remove
+3. **Reproducible environment** — same container works on any machine
+4. **Distributable product** — `docker pull storkit` for new users
+5. **Resource limits** — cap total CPU/memory for the whole system
+
+## Architecture
+
+```
+Docker Container (single)
+├── storkit server
+│   ├── Matrix bot
+│   ├── WhatsApp webhook
+│   ├── Slack webhook
+│   ├── Web UI
+│   └── MCP server
+├── Agent processes (coder-1, coder-2, coder-opus, qa, mergemaster)
+├── Rust toolchain + Node.js + Claude Code CLI
+└── /workspace (bind-mounted project repo from host)
+```
 
 ## Key questions to answer:
 
-- **Performance**: How much slower are cargo builds in a Docker bind-mounted volume on macOS vs native? Compare Docker Desktop vs OrbStack.
-- **Dockerfile**: What's the minimal image? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest.
-- **MCP connectivity**: Can containerized agents connect to the host's MCP server via host.docker.internal?
-- **Git**: Should the container handle git operations, or should the server manage all git and just bind-mount the worktree?
-- **API key**: Pass ANTHROPIC_API_KEY as env var — any security concerns?
-- **Agent spawning**: What changes in pool.rs to spawn `docker run` instead of a PTY?
-- **Output streaming**: Can we get real-time agent output from docker logs -f, or do we need a different approach?
-- **Cargo cache**: Sharing ~/.cargo/registry across containers to avoid cold-start dependency downloads?
-- **OrbStack**: Is it worth requiring OrbStack for Mac users, or should Docker Desktop also be supported?
+- **Performance**: How much slower are cargo builds inside the container on macOS? Compare Docker Desktop vs OrbStack for bind-mounted volumes.
+- **Dockerfile**: What's the minimal image for the full stack? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest + git.
+- **Bind mounts**: The project repo is bind-mounted from the host. Any filesystem performance concerns with OrbStack?
+- **Networking**: Container exposes web UI port (3000). Matrix/WhatsApp/Slack connect outbound. Any issues?
+- **API key**: Pass ANTHROPIC_API_KEY as env var to the container.
+- **Git**: Git operations happen inside the container on the bind-mounted repo. Commits are visible on the host immediately.
+- **Cargo cache**: Use a named Docker volume for ~/.cargo/registry so dependencies persist across container restarts.
+- **Claude Code state**: Where does Claude Code store its session data? Needs to persist or be in a volume.
+- **OrbStack vs Docker Desktop**: Is OrbStack required for acceptable performance, or does Docker Desktop work too?
+- **Server restart**: Does `rebuild_and_restart` work inside a container (re-exec with new binary)?
 
 ## Deliverable:
 
-A short write-up with findings, a proof-of-concept Dockerfile, and a recommendation on whether to proceed with a full implementation story.
+A proof-of-concept Dockerfile, docker-compose.yml, and a short write-up with findings and performance benchmarks.
 
 ## Hypothesis