diff --git a/.storkit/work/1_backlog/402_bug_whatsapp_and_slack_missing_rebuild_command_handler.md b/.storkit/work/1_backlog/402_bug_whatsapp_and_slack_missing_rebuild_command_handler.md
deleted file mode 100644
index 1681da79..00000000
--- a/.storkit/work/1_backlog/402_bug_whatsapp_and_slack_missing_rebuild_command_handler.md
+++ /dev/null
@@ -1,27 +0,0 @@
----
-name: "WhatsApp and Slack missing rebuild command handler"
-blocked: true
----
-
-# Bug 402: WhatsApp and Slack missing rebuild command handler
-
-## Description
-
-The rebuild command has a fallback handler in chat/commands/mod.rs that returns None. Only Matrix has pre-dispatch handling for this command. On WhatsApp and Slack, the command falls through to the LLM path.
-
-## How to Reproduce
-
-1. Configure bot with transport = "whatsapp" or "slack"\n2. Send "rebuild" to the bot\n3. Check server logs
-
-## Actual Result
-
-Command falls through to LLM instead of triggering a server rebuild.
-
-## Expected Result
-
-The bot triggers a server rebuild and replies with confirmation.
-
-## Acceptance Criteria
-
-- [ ] WhatsApp transport handles rebuild command: triggers rebuild and replies with confirmation
-- [ ] Slack transport handles rebuild command: triggers rebuild and replies with confirmation
diff --git a/.storkit/work/1_backlog/407_spike_fly_io_machines_for_multi_tenant_storkit_saas.md b/.storkit/work/1_backlog/407_spike_fly_io_machines_for_multi_tenant_storkit_saas.md
index cd47c114..21c9d4d8 100644
--- a/.storkit/work/1_backlog/407_spike_fly_io_machines_for_multi_tenant_storkit_saas.md
+++ b/.storkit/work/1_backlog/407_spike_fly_io_machines_for_multi_tenant_storkit_saas.md
@@ -18,14 +18,14 @@ Fly.io Machines (Firecracker-based microVMs) offer the right balance of isolatio
 
 ## Investigation Plan
 
-- [ ] Review Fly.io Machines API — create/start/stop/destroy machine via REST, assess Rust `reqwest` integration
-- [ ] Assess isolation model — Firecracker microVM vs gVisor; is it sufficient for tenants running arbitrary shell commands via claude code?
-- [ ] Test cold start time for a storkit container image (target: <2s)
-- [ ] Evaluate persistent volume support — can a volume be attached per tenant for `.storkit/` and project root?
-- [ ] Assess Claude auth injection — how to securely pass `~/.claude/.credentials.json` per tenant at machine start
-- [ ] Sketch the auth proxy design — JWT validation → machine lookup → reverse proxy (WebSocket support required)
-- [ ] Check pricing model for always-on vs stop-on-idle machines at small tenant counts (10, 100, 1000)
-- [ ] Identify any showstoppers (network egress limits, image registry, machine count limits per org)
+- [ ] Review Fly.io Machines API docs (web search) — create/start/stop/destroy machine via REST; document the key endpoints and request shapes for a Rust `reqwest` client
+- [ ] Research Fly.io's documented isolation model — what guarantees do they publish about Firecracker microVM isolation? Document what is claimed, and explicitly flag what would require independent security review before production use. Do not attempt to verify isolation empirically.
+- [ ] Research cold start time — what do Fly.io docs and community benchmarks say? Flag that real numbers require a test account with a storkit image (out of scope for this spike).
+- [ ] Evaluate persistent volume support — document the Fly Volumes API; can a volume be attached per-tenant for `.storkit/` and project root?
+- [ ] Research Claude auth injection — what options exist for passing per-tenant secrets (e.g. `~/.claude/.credentials.json`) at machine start? (env vars, secrets API, volume mounts)
+- [ ] Sketch the auth proxy design — JWT validation → machine lookup → reverse proxy; confirm WebSocket proxying is supported
+- [ ] Check pricing model — document always-on vs stop-on-idle machine costs at 10, 100, 1000 tenants
+- [ ] Identify any documented showstoppers (machine count limits, network egress, image registry constraints)
 
 ## Findings
 