From 65c8dc19d6d62e46026a7b923ca448ad0ce3b8ac Mon Sep 17 00:00:00 2001
From: Dave
Date: Fri, 20 Mar 2026 19:05:18 +0000
Subject: [PATCH] storkit: create 329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting

---
 ...r_agent_isolation_and_resource_limiting.md |  2 +
 ...ime_to_support_non_claude_code_backends.md | 41 -------------------
 2 files changed, 2 insertions(+), 41 deletions(-)
 delete mode 100644 .storkit/work/1_backlog/343_refactor_abstract_agent_runtime_to_support_non_claude_code_backends.md

diff --git a/.storkit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md b/.storkit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
index c68eb30..677baa0 100644
--- a/.storkit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
+++ b/.storkit/work/1_backlog/329_spike_evaluate_docker_orbstack_for_agent_isolation_and_resource_limiting.md
@@ -9,6 +9,8 @@ agent: coder-opus
 
 Investigate running the entire storkit system (server, Matrix bot, agents, web UI) inside a single Docker container, using OrbStack as the macOS runtime for better performance. The goal is to isolate storkit from the host machine — not to isolate agents from each other.
 
+**Important context:** Storkit developing itself is the dogfood edge case. The primary use case is storkit managing agents that develop *other* projects, driven by multiple users in chat rooms (Matrix, WhatsApp, Slack). Isolation must account for untrusted codebases, multi-user command surfaces, and running against arbitrary repos — not just the single-developer self-hosted setup.
+
 Currently storkit runs as bare processes on the host with full filesystem and network access. A single container would provide:
 
 1. **Host isolation** — storkit can't touch anything outside the container
diff --git a/.storkit/work/1_backlog/343_refactor_abstract_agent_runtime_to_support_non_claude_code_backends.md b/.storkit/work/1_backlog/343_refactor_abstract_agent_runtime_to_support_non_claude_code_backends.md
deleted file mode 100644
index 6a7c7c1..0000000
--- a/.storkit/work/1_backlog/343_refactor_abstract_agent_runtime_to_support_non_claude_code_backends.md
+++ /dev/null
@@ -1,41 +0,0 @@
----
-name: "Abstract agent runtime to support non-Claude-Code backends"
-agent: coder-opus
----
-
-# Refactor 343: Abstract agent runtime to support non-Claude-Code backends
-
-## Current State
-
-- TBD
-
-## Desired State
-
-Currently agent spawning is tightly coupled to Claude Code CLI — agents are spawned as PTY processes running the `claude` binary. To support ChatGPT and Gemini as agent backends, we need to abstract the agent runtime.
-
-The agent pool currently does:
-1. Spawn `claude` CLI process via portable-pty
-2. Stream JSON events from stdout
-3. Parse tool calls, text output, thinking traces
-4. Wait for process exit, run gates
-
-This needs to become a trait so different backends can be plugged in:
-- Claude Code (existing) — spawns `claude` CLI, parses JSON stream
-- OpenAI API — calls ChatGPT via API with tool definitions, manages conversation loop
-- Gemini API — calls Gemini via API with tool definitions, manages conversation loop
-
-The key abstraction is: an agent runtime takes a prompt + tools and produces a stream of events (text output, tool calls, completion). The existing PTY/Claude Code logic becomes one implementation of this trait.
-
-## Acceptance Criteria
-
-- [ ] Define an AgentRuntime trait with methods for: start, stream_events, stop, get_status
-- [ ] ClaudeCodeRuntime implements the trait using existing PTY spawning logic
-- [ ] Agent pool uses the trait instead of directly spawning Claude Code
-- [ ] Runtime selection is configurable per agent in project.toml (e.g. runtime = 'claude-code')
-- [ ] All existing Claude Code agent functionality preserved
-- [ ] Event stream format is runtime-agnostic (text, tool_call, thinking, done)
-- [ ] Token usage tracking works across runtimes
-
-## Out of Scope
-
-- TBD