# Spike: Claude Code Integration via PTY + CLI

**Question:** Can we run Claude Code programmatically from our Rust backend while using Max subscription billing instead of per-token API billing?

**Hypothesis:** Spawning `claude -p` inside a pseudo-terminal (PTY) will make `isatty()` return true, causing Claude Code to use Max subscription billing while giving us structured JSON output.

**Timebox:** 2 hours

**Result: HYPOTHESIS CONFIRMED**

---

## Proof

Spawning `claude -p "hi" --output-format stream-json --verbose` inside a PTY from Rust (`portable-pty` crate) produces:

```json
{"type":"system","subtype":"init","apiKeySource":"none","model":"claude-opus-4-6",...}
{"type":"rate_limit_event","rate_limit_info":{"status":"allowed","rateLimitType":"five_hour",...}}
{"type":"assistant","message":{"model":"claude-opus-4-6","content":[{"type":"text","text":"Hi! How can I help you today?"}],...}}
{"type":"result","subtype":"success","total_cost_usd":0.0102,...}
```

Key evidence:

- **`apiKeySource: "none"`**: not using an API key
- **`rateLimitType: "five_hour"`**: Max subscription rate limiting (not per-token)
- **`model: "claude-opus-4-6"`**: Opus on Max plan
- Clean NDJSON output, parseable from Rust
- Response streamed to browser UI via WebSocket

## Architecture (Proven)

```
Browser UI → WebSocket → Rust Backend → PTY → claude -p --output-format stream-json
                                         ↑
                                         isatty() = true → Max subscription billing
```

## What Works

1. `portable-pty` crate spawns Claude Code in a PTY from Rust
2. `-p` flag gives single-shot non-interactive mode (no TUI)
3. `--output-format stream-json` gives clean NDJSON (no ANSI escapes)
4. PTY makes `isatty()` return true → Max billing
5. NDJSON events parsed and streamed to frontend via WebSocket
6. Session IDs returned for potential multi-turn via `--resume`

## Event Types from stream-json

| Type | Purpose | Key Fields |
|------|---------|------------|
| `system` | Init event | `session_id`, `model`, `apiKeySource`, `tools`, `agents` |
| `rate_limit_event` | Billing info | `status`, `rateLimitType` |
| `assistant` | Claude's response | `message.content[].text` |
| `result` | Final summary | `total_cost_usd`, `usage`, `duration_ms` |
| `stream_event` | Token deltas (with `--include-partial-messages`) | `event.delta.text` |

## Multi-Agent Concurrency (Proven)

Created an `AgentPool` with REST API (`POST /api/agents`, `POST /api/agents/:name/message`, `GET /api/agents`) and tested 2 concurrent coding agents:

**Test:** Created `coder-1` (frontend role) and `coder-2` (backend role), sent both messages simultaneously.

```
coder-1: Listed 5 React components in 5s (session: ca3e13fc-...)
coder-2: Listed 30 Rust source files in 8s (session: 8a815cf0-...)
Both: apiKeySource: "none", rateLimitType: "five_hour" (Max billing)
```

**Session resumption confirmed:** Sent coder-1 the follow-up "How many components did you just list?" and it answered "5" using `--resume <session-id>`.

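
The follow-up turn described above can be sketched as a small argument builder. This is a minimal illustration, not the spike's actual code: the function name `build_claude_args` and the placeholder session ID are hypothetical, and the argument order (prompt first, `--resume` last) is an assumption. Only flags already listed in this document are used, and the resulting vector would be handed to whatever spawns the PTY child.

```rust
/// Build the argument list for one `claude -p` invocation.
/// `session_id` is expected to come from the `session_id` field of the
/// `system` init event of an earlier turn (hypothetical plumbing).
fn build_claude_args(prompt: &str, session_id: Option<&str>) -> Vec<String> {
    let mut args = vec![
        "-p".to_string(),
        prompt.to_string(),
        "--output-format".to_string(),
        "stream-json".to_string(),
        "--verbose".to_string(),
    ];
    // Only a follow-up turn carries `--resume`; the first turn starts fresh.
    if let Some(id) = session_id {
        args.push("--resume".to_string());
        args.push(id.to_string());
    }
    args
}

fn main() {
    // First turn: no session yet.
    let first = build_claude_args("List the React components", None);
    // Follow-up turn: reuse the session ID captured from the first run.
    // The ID below is a placeholder, not a real session.
    let follow_up = build_claude_args(
        "How many components did you just list?",
        Some("SESSION_ID_FROM_INIT_EVENT"),
    );
    println!("claude {}", first.join(" "));
    println!("claude {}", follow_up.join(" "));
}
```

Keeping the builder free of any PTY or I/O code makes the flag logic easy to unit-test separately from process spawning.
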

**What this proves:**

- Multiple PTY sessions run concurrently without conflict
- Each gets Max subscription billing independently
- `--resume` gives agents multi-turn conversation memory
- Supervisor pattern works: coordinator reads agent responses, sends coordinated tasks
- Inter-agent communication possible via supervisor relay

**Architecture for multi-agent orchestration:**

- Spawn N PTY sessions, each with `claude -p` pointed at a different worktree
- Rust backend coordinates work between agents
- Different `--model` per agent (Opus for supervisor, Sonnet/Haiku for workers)
- `--allowedTools` to restrict what each agent can do
- `--max-turns` and `--max-budget-usd` for safety limits

## Key Flags for Programmatic Use

```bash
claude -p "prompt"                      # Single-shot mode
  --output-format stream-json           # NDJSON output
  --verbose                             # Include all events
  --include-partial-messages            # Token-by-token streaming
  --model sonnet                        # Model selection
  --allowedTools "Read,Edit,Bash"       # Tool permissions
  --permission-mode bypassPermissions   # No approval prompts
  --resume <session-id>                 # Continue conversation
  --max-turns 10                        # Safety limit
  --max-budget-usd 5.00                 # Cost cap
  --append-system-prompt "..."          # Custom instructions
  --cwd /path/to/worktree               # Working directory
```

## Agent SDK Comparison

The Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) is a richer TypeScript API with hooks, subagents, and MCP integration, but it **requires an API key** (per-token billing). The PTY approach is the only way to get Max subscription billing programmatically.


| Factor | PTY + CLI | Agent SDK |
|--------|-----------|-----------|
| Billing | Max subscription | API key (per-token) |
| Language | Any (subprocess) | TypeScript/Python |
| Streaming | NDJSON parsing | Native async iterators |
| Hooks | Not available | Callback functions |
| Subagents | Multiple processes | In-process `agents` option |
| Sessions | `--resume` flag | In-memory |
| Complexity | Low | Medium (needs Node.js) |

## Caveats

- Cost reported in `total_cost_usd` is informational, not actual billing
- Concurrent PTY sessions may hit Max subscription rate limits
- Each `-p` invocation is a fresh process (startup overhead ~2-3s)
- PTY dependency (`portable-pty`) adds ~15 crates

## Next Steps

1. **Story:** Add `--include-partial-messages` for real-time token streaming to browser
2. **Story:** Production multi-agent orchestration with worktree isolation per agent
3. **Story:** Streaming HTTP responses (SSE) instead of blocking request until agent completes
4. **Consider:** Whether Rust backend should become a thin orchestration layer over Claude Code rather than reimplementing agent capabilities
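
As a possible starting point for the multi-agent orchestration story, per-agent limits could be captured in a small config type that renders the flags listed under Key Flags. This is a sketch under stated assumptions: `AgentConfig`, its field names, the worktree paths, and the `opus` model alias are all hypothetical. The worktree is kept as a plain field on the assumption that the spawner sets it as the child's working directory.

```rust
/// Hypothetical per-agent settings for one `claude -p` worker.
struct AgentConfig {
    name: String,
    /// Directory the agent should run in; assumed to be applied by the
    /// code that spawns the PTY child, not passed as a flag here.
    worktree: String,
    model: String,
    allowed_tools: Vec<String>,
    max_turns: u32,
    max_budget_usd: f64,
}

impl AgentConfig {
    /// Render this config plus a prompt into CLI arguments.
    fn to_args(&self, prompt: &str) -> Vec<String> {
        vec![
            "-p".to_string(),
            prompt.to_string(),
            "--output-format".to_string(),
            "stream-json".to_string(),
            "--verbose".to_string(),
            "--model".to_string(),
            self.model.clone(),
            "--allowedTools".to_string(),
            // Tools are passed as one comma-separated value, e.g. "Read,Edit,Bash".
            self.allowed_tools.join(","),
            "--max-turns".to_string(),
            self.max_turns.to_string(),
            "--max-budget-usd".to_string(),
            // Two decimals, matching the `5.00` style used in the flag list.
            format!("{:.2}", self.max_budget_usd),
        ]
    }
}

fn main() {
    let agents = vec![
        AgentConfig {
            name: "supervisor".to_string(),
            worktree: "/path/to/worktree-main".to_string(), // placeholder path
            model: "opus".to_string(),                      // assumed alias
            allowed_tools: vec!["Read".to_string()],
            max_turns: 10,
            max_budget_usd: 5.0,
        },
        AgentConfig {
            name: "coder-1".to_string(),
            worktree: "/path/to/worktree-1".to_string(), // placeholder path
            model: "sonnet".to_string(),
            allowed_tools: vec!["Read".to_string(), "Edit".to_string(), "Bash".to_string()],
            max_turns: 10,
            max_budget_usd: 5.0,
        },
    ];
    for agent in &agents {
        let args = agent.to_args("Summarize this worktree");
        println!("[{} @ {}] claude {}", agent.name, agent.worktree, args.join(" "));
    }
}
```

Giving each worker a narrower tool list and a budget cap than the supervisor keeps a misbehaving agent from exhausting the shared rate limit noted under Caveats.
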