story-kit: create 329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting
@@ -6,28 +6,46 @@ name: "Evaluate Docker/OrbStack for agent isolation and resource limiting"

 ## Question

-Investigate using Docker (or OrbStack as a faster macOS alternative) to isolate agent processes from the host. Currently agents run as bare Claude Code processes on the host with full filesystem and network access. Docker could provide:
+Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.

-1. **Filesystem isolation** — agents only see their worktree, not the host filesystem
-2. **Network isolation** — agents can't talk to Matrix, SSH, or external services unless explicitly allowed
-3. **Resource limits** — cap CPU and memory per agent to prevent load average spikes (currently hitting 27)
-4. **Clean environments** — each agent gets a fresh container with just the toolchain
-5. **Kill switch** — docker kill is cleaner than tracking PTY child processes
+Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:
+
+1. **Host isolation** — storkit can't touch anything outside the container
+2. **Clean install/uninstall** — `docker run` to start, `docker rm` to remove
+3. **Reproducible environment** — same container works on any machine
+4. **Distributable product** — `docker pull storkit` for new users
+5. **Resource limits** — cap total CPU/memory for the whole system
+
+## Architecture
+
+```
+Docker Container (single)
+├── storkit server
+│   ├── Matrix bot
+│   ├── WhatsApp webhook
+│   ├── Slack webhook
+│   ├── Web UI
+│   └── MCP server
+├── Agent processes (coder-1, coder-2, coder-opus, qa, mergemaster)
+├── Rust toolchain + Node.js + Claude Code CLI
+└── /workspace (bind-mounted project repo from host)
+```
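
Reviewer note: the single-container layout above could be launched with one `docker run` invocation. The sketch below only builds the argument list, so it is a starting point for the spike rather than the project's launcher. The `ContainerConfig` type, the container name `storkit`, the volume name `storkit-cargo-registry`, and the `/root/.cargo/registry` path (which assumes the container runs as root) are all hypothetical. The image name, port 3000, `/workspace`, and `ANTHROPIC_API_KEY` come from the story text.

```rust
/// Illustrative settings for the single storkit container (hypothetical type).
pub struct ContainerConfig {
    /// Image to run, e.g. "storkit" as in `docker pull storkit`.
    pub image: String,
    /// Absolute path of the project repo on the host.
    pub host_repo: String,
    /// CPU cap for the whole system, passed to `--cpus`.
    pub cpus: f64,
    /// Memory cap for the whole system, passed to `--memory` (e.g. "8g").
    pub memory: String,
}

/// Build the arguments for `docker run` (everything after the `docker` binary).
pub fn docker_run_args(cfg: &ContainerConfig) -> Vec<String> {
    vec![
        "run".to_string(),
        "--detach".to_string(),
        // Hypothetical container name.
        "--name".to_string(),
        "storkit".to_string(),
        // Resource limits for the whole system, not per agent.
        "--cpus".to_string(),
        cfg.cpus.to_string(),
        "--memory".to_string(),
        cfg.memory.clone(),
        // Web UI port.
        "--publish".to_string(),
        "3000:3000".to_string(),
        // Project repo bind-mounted from the host.
        "--volume".to_string(),
        format!("{}:/workspace", cfg.host_repo),
        // Named volume for the cargo registry; the path assumes a root user.
        "--volume".to_string(),
        "storkit-cargo-registry:/root/.cargo/registry".to_string(),
        // With no value, the variable is copied from the caller's environment.
        "--env".to_string(),
        "ANTHROPIC_API_KEY".to_string(),
        cfg.image.clone(),
    ]
}
```

The result can be handed to `std::process::Command::new("docker").args(...)`. Keeping the argument list a pure function makes it easy to unit-test and to mirror in a docker-compose.yml.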

 ## Key questions to answer:

-- **Performance**: How much slower are cargo builds in a Docker bind-mounted volume on macOS vs native? Compare Docker Desktop vs OrbStack.
-- **Dockerfile**: What's the minimal image? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest.
-- **MCP connectivity**: Can containerized agents connect to the host's MCP server via host.docker.internal?
-- **Git**: Should the container handle git operations, or should the server manage all git and just bind-mount the worktree?
-- **API key**: Pass ANTHROPIC_API_KEY as env var — any security concerns?
-- **Agent spawning**: What changes in pool.rs to spawn `docker run` instead of a PTY?
-- **Output streaming**: Can we get real-time agent output from docker logs -f, or do we need a different approach?
-- **Cargo cache**: Sharing ~/.cargo/registry across containers to avoid cold-start dependency downloads?
-- **OrbStack**: Is it worth requiring OrbStack for Mac users, or should Docker Desktop also be supported?
+- **Performance**: How much slower are cargo builds inside the container on macOS? Compare Docker Desktop vs OrbStack for bind-mounted volumes.
+- **Dockerfile**: What's the minimal image for the full stack? Rust toolchain + Node.js + Claude Code CLI + cargo-nextest + git.
+- **Bind mounts**: The project repo is bind-mounted from the host. Any filesystem performance concerns with OrbStack?
+- **Networking**: Container exposes web UI port (3000). Matrix/WhatsApp/Slack connect outbound. Any issues?
+- **API key**: Pass ANTHROPIC_API_KEY as env var to the container.
+- **Git**: Git operations happen inside the container on the bind-mounted repo. Commits are visible on the host immediately.
+- **Cargo cache**: Use a named Docker volume for ~/.cargo/registry so dependencies persist across container restarts.
+- **Claude Code state**: Where does Claude Code store its session data? Needs to persist or be in a volume.
+- **OrbStack vs Docker Desktop**: Is OrbStack required for acceptable performance, or does Docker Desktop work too?
+- **Server restart**: Does `rebuild_and_restart` work inside a container (re-exec with new binary)?
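
Reviewer note on the server-restart question: the story describes `rebuild_and_restart` only as "re-exec with new binary", so the following is a generic Unix re-exec sketch, not the project's actual code. The function names `restart_command` and `reexec_self` are hypothetical. Because `exec` replaces the process image without changing the PID, a server running as the container's main process should keep the container alive across the restart. That assumption is worth confirming in the spike, along with whether the path reported by `current_exe` still points at the rebuilt binary.

```rust
use std::env;
use std::io;
use std::os::unix::process::CommandExt;
use std::path::Path;
use std::process::Command;

/// Build the command that would replace the running server.
pub fn restart_command(exe: &Path, args: &[String]) -> Command {
    let mut cmd = Command::new(exe);
    cmd.args(args);
    cmd
}

/// Replace the current process with a fresh copy of its own binary,
/// reusing the original arguments. Returns only if the re-exec fails.
pub fn reexec_self() -> io::Error {
    let exe = match env::current_exe() {
        Ok(path) => path,
        Err(err) => return err,
    };
    // Skip argv[0]; `Command` supplies the program name itself.
    let args: Vec<String> = env::args().skip(1).collect();
    // `exec` never returns on success, so its return value is the error.
    restart_command(&exe, &args).exec()
}
```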

 ## Deliverable:
-A short write-up with findings, a proof-of-concept Dockerfile, and a recommendation on whether to proceed with a full implementation story.
+A proof-of-concept Dockerfile, docker-compose.yml, and a short write-up with findings and performance benchmarks.
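
Reviewer note: for the performance benchmarks, a small timing helper may be enough to compare native and containerized cargo builds. This is a minimal sketch; `time_command` and `slowdown` are hypothetical names, and a real benchmark would repeat each run and separate cold-cache from warm-cache builds.

```rust
use std::io;
use std::process::Command;
use std::time::{Duration, Instant};

/// Run a command to completion and return its wall-clock duration.
/// A non-zero exit status is reported as an error.
pub fn time_command(program: &str, args: &[&str]) -> io::Result<Duration> {
    let start = Instant::now();
    let status = Command::new(program).args(args).status()?;
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{program} exited with {status}"),
        ));
    }
    Ok(start.elapsed())
}

/// Container build time divided by native build time (2.0 means twice as slow).
pub fn slowdown(native: Duration, container: Duration) -> f64 {
    container.as_secs_f64() / native.as_secs_f64()
}
```

One way to use it: time `cargo build` on the host with `time_command("cargo", &["build"])`, then time the same build in the container with `time_command("docker", &["exec", "--workdir", "/workspace", "storkit", "cargo", "build"])`, where `storkit` is an assumed container name. Running the container side once under Docker Desktop and once under OrbStack gives the comparison the story asks for.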

 ## Hypothesis
