Story 12: Update story and specs for Claude integration

Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (claude-* prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
2025-12-27 19:37:01 +00:00
parent ca7efc2888
commit e71dcd8226
3 changed files with 221 additions and 9 deletions
--- a/.living_spec/specs/functional/AI_INTEGRATION.md
+++ b/.living_spec/specs/functional/AI_INTEGRATION.md
@@ -5,6 +5,14 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
 *   **Generation:** Sending prompt + history + tools to the model.
 *   **Parsing:** Extracting text content vs. tool calls from the raw response.

+The system supports multiple LLM providers:
+*   **Ollama:** Local models running via Ollama server
+*   **Anthropic:** Claude models via Anthropic API (Story 12)
+
+Provider selection is **automatic** based on model name:
+*   Model starts with `claude-` → Anthropic provider
+*   Otherwise → Ollama provider
+
 ## 2. Ollama Implementation
 *   **Endpoint:** `http://localhost:11434/api/chat`
 *   **JSON Protocol:**
@@ -12,7 +20,82 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
    *   Response: Standard Ollama JSON with `message.tool_calls`.
 *   **Fallback:** If the specific local model doesn't support native tool calling, we may need a fallback system prompt approach, but for this story, we assume a tool-capable model (like `llama3.1` or `mistral-nemo`).

-## 3. Chat Loop (Backend)
+## 3. Anthropic (Claude) Implementation
+
+### Endpoint
+*   **Endpoint URL:** `https://api.anthropic.com/v1/messages`
+*   **Authentication:** Requires `x-api-key` header with Anthropic API key
+*   **API Version:** `anthropic-version: 2023-06-01` header required
+
+### API Protocol
+*   **Request Format:**
+    ```json
+    {
+      "model": "claude-3-5-sonnet-20241022",
+      "max_tokens": 4096,
+      "messages": [
+        {"role": "user", "content": "Hello"},
+        {"role": "assistant", "content": "Hi!"}
+      ],
+      "tools": [...],
+      "stream": true
+    }
+    ```
+*   **Response Format (Streaming):**
+    *   Server-Sent Events (SSE)
+    *   Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`
+    *   Tool calls appear as `content_block` with `type: "tool_use"`
+
+### Tool Format Differences
+Anthropic's tool format differs from Ollama/OpenAI:
+
+**Anthropic Tool Definition:**
+```json
+{
+  "name": "read_file",
+  "description": "Reads a file",
+  "input_schema": {
+    "type": "object",
+    "properties": {
+      "path": {"type": "string"}
+    },
+    "required": ["path"]
+  }
+}
+```
+
+**Our Internal Format:**
+```json
+{
+  "type": "function",
+  "function": {
+    "name": "read_file",
+    "description": "Reads a file",
+    "parameters": {
+      "type": "object",
+      "properties": {
+        "path": {"type": "string"}
+      },
+      "required": ["path"]
+    }
+  }
+}
+```
+
+The backend must convert between these formats.
+
+### Context Windows
+*   **claude-3-5-sonnet-20241022:** 200,000 tokens
+*   **claude-3-5-haiku-20241022:** 200,000 tokens
+
+### API Key Storage
+*   **Storage:** OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
+*   **Crate:** `keyring` for cross-platform support
+*   **Service Name:** `living-spec-anthropic-api-key`
+*   **Username:** `default`
+*   **Retrieval:** On first use of a Claude model, check the keychain. If no key is found, prompt the user.
+
+## 4. Chat Loop (Backend)
 The `chat` command acts as the **Agent Loop**:
 1.  Frontend sends: `User Message`.
 2.  Backend appends to `SessionState.history`.
@@ -24,6 +107,44 @@ The `chat` command acts as the **Agent Loop**:
    *   Backend *re-prompts* Ollama with the new history (recursion).
    *   Repeat until Text Response or Max Turns reached.

-## 4. Frontend State
-*   **Settings:** Store `llm_provider` ("ollama"), `ollama_model` ("llama3.2"), `ollama_base_url`.
+## 5. Model Selection UI
+
+### Unified Dropdown
+The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:
+
+```html
+<select>
+  <optgroup label="Anthropic">
+    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
+    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
+  </optgroup>
+  <optgroup label="Ollama">
+    <option value="deepseek-r1:70b">deepseek-r1:70b</option>
+    <option value="llama3.1">llama3.1</option>
+    <option value="qwen2.5">qwen2.5</option>
+  </optgroup>
+</select>
+```
+
+### Model List Sources
+*   **Ollama:** Fetched from `http://localhost:11434/api/tags` via `get_ollama_models` command
+*   **Anthropic:** Hardcoded list of supported Claude models (not fetched from the API at runtime)
+
+### API Key Flow
+1. User selects a Claude model from dropdown
+2. Frontend sends chat request to backend
+3. Backend detects `claude-` prefix in model name
+4. Backend checks OS keychain for stored API key
+5. If not found:
+   - Backend returns error: "Anthropic API key not found"
+   - Frontend shows dialog prompting for API key
+   - User enters key
+   - Frontend calls `set_anthropic_api_key` command
+   - Backend stores key in OS keychain
+   - User retries chat request
+6. If found: Backend proceeds with Anthropic API request
+
+## 6. Frontend State
+*   **Settings:** Store `selected_model` (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
+*   **Provider Detection:** Auto-detected from model name (frontend doesn't need to track provider separately)
 *   **Chat:** Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).
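
Reviewer note: the provider auto-detection and tool format conversion specified in this diff could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the project's implementation. All type and function names (`Provider`, `detect_provider`, `InternalTool`, `FunctionDef`, `AnthropicTool`, `to_anthropic_tool`, `to_internal_tool`) are hypothetical, and the JSON Schema is carried as raw text so the example needs no JSON crate; a real backend would more likely use typed JSON values.

```rust
/// Which backend serves a given model (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Provider {
    Anthropic,
    Ollama,
}

/// Route by model name: a `claude-` prefix selects Anthropic,
/// anything else falls through to Ollama, as the spec describes.
pub fn detect_provider(model: &str) -> Provider {
    if model.starts_with("claude-") {
        Provider::Anthropic
    } else {
        Provider::Ollama
    }
}

/// Internal (Ollama/OpenAI-style) tool definition.
/// Assumption: `parameters` holds the JSON Schema as raw text.
#[derive(Debug, Clone, PartialEq)]
pub struct InternalTool {
    /// The `type` field of the internal format, expected to be "function".
    pub kind: String,
    pub function: FunctionDef,
}

#[derive(Debug, Clone, PartialEq)]
pub struct FunctionDef {
    pub name: String,
    pub description: String,
    pub parameters: String,
}

/// Anthropic-style tool definition: flat, with `input_schema`
/// in place of the nested `function.parameters`.
#[derive(Debug, Clone, PartialEq)]
pub struct AnthropicTool {
    pub name: String,
    pub description: String,
    pub input_schema: String,
}

/// Internal -> Anthropic: drop the wrapper, rename `parameters` to `input_schema`.
pub fn to_anthropic_tool(tool: &InternalTool) -> AnthropicTool {
    AnthropicTool {
        name: tool.function.name.clone(),
        description: tool.function.description.clone(),
        input_schema: tool.function.parameters.clone(),
    }
}

/// Anthropic -> internal: restore the `type: "function"` wrapper.
pub fn to_internal_tool(tool: &AnthropicTool) -> InternalTool {
    InternalTool {
        kind: "function".to_string(),
        function: FunctionDef {
            name: tool.name.clone(),
            description: tool.description.clone(),
            parameters: tool.input_schema.clone(),
        },
    }
}
```

Because the two formats carry the same three pieces of information, the conversion is lossless in both directions. Keeping detection in one small function also means the frontend only has to store `selected_model`, as the Frontend State section states.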