From dbc884968162b33cfade1d7e728969fa1e1dc3d3 Mon Sep 17 00:00:00 2001
From: Dave <futurechimp@users.noreply.github.com>
Date: Fri, 20 Mar 2026 07:32:12 +0000
Subject: [PATCH] story-kit: create
 329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting

---
 ...r_agent_isolation_and_resource_limiting.md | 50 +++++++++++++------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md b/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
index 8b6efb6..642076a 100644
--- a/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
+++ b/.story_kit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
@@ -6,28 +6,46 @@ name: "Evaluate Docker/OrbStack for agent isolation and resource limiting"
 
 ## Question
 
-Investigate using Docker (or OrbStack as a faster macOS alternative) to isolate agent processes from the host. Currently agents run as bare Claude Code processes on the host with full filesystem and network access. Docker could provide:
+Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.
 
-1. **Filesystem isolation** — agents only see their worktree, not the host filesystem
-2. **Network isolation** — agents can't talk to Matrix, SSH, or external services unless explicitly allowed
-3. **Resource limits** — cap CPU and memory per agent to prevent load average spikes (currently hitting 27)
-4. **Clean environments** — each agent gets a fresh container with just the toolchain
-5. **Kill switch** — docker kill is cleaner than tracking PTY child processes
+Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:
+
+1. **Host isolation** — storkit can't touch anything on the host outside the container, apart from the bind-mounted project repo
+2. **Clean install/uninstall** — `docker run` to start, `docker rm` to remove
+3. **Reproducible environment** — same container works on any machine
+4. **Distributable product** — `docker pull storkit` for new users
+5. **Resource limits** — cap total CPU/memory for the whole system
+
+## Architecture
+
+```
+Docker Container (single)
+├── storkit server
+│   ├── Matrix bot
+│   ├── WhatsApp webhook
+│   ├── Slack webhook
+│   ├── Web UI
+│   └── MCP server
+├── Agent processes (coder-1, coder-2, coder-opus, qa, mergemaster)
+├── Rust toolchain + Node.js + Claude Code CLI
+└── /workspace (bind-mounted project repo from host)
+```
 
 ## Key questions to answer:
 
-- **Performance**: How much slower are cargo builds in a Docker bind-mounted volume on macOS vs native? Compare Docker Desktop vs OrbStack.
-- **Dockerfile**: What's the minimal image? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest.
-- **MCP connectivity**: Can containerized agents connect to the host's MCP server via host.docker.internal?
-- **Git**: Should the container handle git operations, or should the server manage all git and just bind-mount the worktree?
-- **API key**: Pass ANTHROPIC_API_KEY as env var — any security concerns?
-- **Agent spawning**: What changes in pool.rs to spawn `docker run` instead of a PTY?
-- **Output streaming**: Can we get real-time agent output from docker logs -f, or do we need a different approach?
-- **Cargo cache**: Sharing ~/.cargo/registry across containers to avoid cold-start dependency downloads?
-- **OrbStack**: Is it worth requiring OrbStack for Mac users, or should Docker Desktop also be supported?
+- **Performance**: How much slower are cargo builds inside the container on macOS than native builds on the host? Compare Docker Desktop vs OrbStack for bind-mounted volumes.
+- **Dockerfile**: What's the minimal image for the full stack? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest + git.
+- **Bind mounts**: The project repo is bind-mounted from the host. Any filesystem performance concerns with OrbStack?
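+- **Networking**: Container exposes web UI port (3000). The Matrix bot connects outbound, but webhooks receive inbound requests: can the WhatsApp and Slack webhooks be reached through that port, or do they need their own published port? Any other issues?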
+- **API key**: Pass `ANTHROPIC_API_KEY` as an env var to the container. Any security concerns, given that env vars set this way are visible via `docker inspect`?
+- **Git**: Git operations happen inside the container on the bind-mounted repo. Commits are visible on the host immediately.
+- **Cargo cache**: Use a named Docker volume for ~/.cargo/registry so dependencies persist across container restarts.
+- **Claude Code state**: Where does Claude Code store its session data? Needs to persist or be in a volume.
+- **OrbStack vs Docker Desktop**: Is OrbStack required for acceptable performance, or does Docker Desktop work too?
+- **Server restart**: Does `rebuild_and_restart` work inside a container (re-exec with new binary)?
 
 ## Deliverable:
-A short write-up with findings, a proof-of-concept Dockerfile, and a recommendation on whether to proceed with a full implementation story.
+A proof-of-concept Dockerfile, docker-compose.yml, and a short write-up with findings and performance benchmarks.
 
 ## Hypothesis