Story 12: Update story and specs for Claude integration

Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (claude-* prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
2025-12-27 19:37:01 +00:00
parent ca7efc2888
commit e71dcd8226
3 changed files with 221 additions and 9 deletions
--- a/.living_spec/specs/functional/AI_INTEGRATION.md
+++ b/.living_spec/specs/functional/AI_INTEGRATION.md
@@ -5,6 +5,14 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
 *   **Generation:** Sending prompt + history + tools to the model.
 *   **Parsing:** Extracting text content vs. tool calls from the raw response.
 The system supports multiple LLM providers:
 *   **Ollama:** Local models running via Ollama server
 *   **Anthropic:** Claude models via Anthropic API (Story 12)
 Provider selection is **automatic** based on model name:
 *   Model starts with `claude-` → Anthropic provider
 *   Otherwise → Ollama provider
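
The prefix rule above can be sketched as a small routing function. This is a minimal illustration, not the actual backend code: the `Provider` enum and `detect_provider` name are hypothetical.

```rust
/// Which backend a model name routes to (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    Ollama,
}

/// Prefix rule from the spec: `claude-` selects Anthropic, anything else Ollama.
pub fn detect_provider(model: &str) -> Provider {
    if model.starts_with("claude-") {
        Provider::Anthropic
    } else {
        Provider::Ollama
    }
}
```

As written, the match is case-sensitive, so a name such as `Claude-3` would fall through to Ollama. Whether that is the desired behavior is a design choice the spec leaves open.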
 ## 2. Ollama Implementation
 *   **Endpoint:** `http://localhost:11434/api/chat`
 *   **JSON Protocol:**
@@ -12,7 +20,82 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
    *   Response: Standard Ollama JSON with `message.tool_calls`.
 *   **Fallback:** If the specific local model doesn't support native tool calling, we may need a fallback system prompt approach, but for this story, we assume a tool-capable model (like `llama3.1` or `mistral-nemo`).
-## 3. Chat Loop (Backend)
+## 3. Anthropic (Claude) Implementation
 ### Endpoint
 *   **URL:** `https://api.anthropic.com/v1/messages`
 *   **Authentication:** Requires `x-api-key` header with Anthropic API key
 *   **API Version:** `anthropic-version: 2023-06-01` header required
 ### API Protocol
 *   **Request Format:**
    ```json
    {
      "model": "claude-3-5-sonnet-20241022",
      "max_tokens": 4096,
      "messages": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi!"}
      ],
      "tools": [...],
      "stream": true
    }
    ```
 *   **Response Format (Streaming):**
    *   Server-Sent Events (SSE)
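    *   Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop` (plus periodic `ping` events)
    *   Tool calls appear as a content block with `type: "tool_use"`; the tool input streams as partial JSON in `input_json_delta` deltas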
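
To illustrate the wire format, here is a dependency-free sketch that splits a complete SSE body into events. The spec assigns this job to the `eventsource-stream` crate, so this only shows the framing. `SseEvent` and `parse_sse` are hypothetical names, and the sketch assumes the whole body is already in memory rather than arriving in chunks.

```rust
/// One parsed Server-Sent Event: the `event:` name and the joined `data:` payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SseEvent {
    pub event: String,
    pub data: String,
}

/// Splits a complete SSE body into events.
/// An event ends at a blank line; lines starting with `:` are comments.
/// A trailing event without a terminating blank line is dropped.
pub fn parse_sse(raw: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event = String::new();
    let mut data: Vec<String> = Vec::new();

    for line in raw.lines() {
        if line.is_empty() {
            // Blank line: emit the buffered event if it carried any data.
            if !data.is_empty() {
                events.push(SseEvent {
                    event: std::mem::take(&mut event),
                    data: data.join("\n"),
                });
                data.clear();
            } else {
                event.clear();
            }
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line
        }
        // Split "field: value"; a single space after the colon is not part of the value.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event = value.to_string(),
            "data" => data.push(value.to_string()),
            _ => {} // other fields (`id`, `retry`) are ignored in this sketch
        }
    }
    events
}
```

Each `data` payload is a JSON object whose `type` matches the event name, so a real client would deserialize it next and dispatch on the event type.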
 ### Tool Format Differences
 Anthropic's tool format differs from Ollama/OpenAI:
 **Anthropic Tool Definition:**
 ```json
 {
  "name": "read_file",
  "description": "Reads a file",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {"type": "string"}
    },
    "required": ["path"]
  }
 }
 ```
 **Our Internal Format:**
 ```json
 {
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Reads a file",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      },
      "required": ["path"]
    }
  }
 }
 ```
 The backend must convert between these formats.
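
A minimal sketch of that conversion, assuming the two shapes shown above. The struct and function names are hypothetical, and the JSON Schema is carried as raw JSON text only to keep the example dependency-free. A real implementation would more likely hold it as a parsed JSON value.

```rust
/// Internal (OpenAI/Ollama-style) tool definition.
#[derive(Debug, Clone, PartialEq)]
pub struct InternalTool {
    /// The `type` field; always "function" in the internal format.
    pub kind: String,
    pub function: FunctionDef,
}

#[derive(Debug, Clone, PartialEq)]
pub struct FunctionDef {
    pub name: String,
    pub description: String,
    /// JSON Schema for the arguments, kept as raw JSON text in this sketch.
    pub parameters: String,
}

/// Anthropic tool definition: flat, with the schema under `input_schema`.
#[derive(Debug, Clone, PartialEq)]
pub struct AnthropicTool {
    pub name: String,
    pub description: String,
    pub input_schema: String,
}

/// Internal -> Anthropic: unwrap `function` and rename `parameters`.
pub fn to_anthropic_tool(tool: &InternalTool) -> AnthropicTool {
    AnthropicTool {
        name: tool.function.name.clone(),
        description: tool.function.description.clone(),
        input_schema: tool.function.parameters.clone(),
    }
}

/// Anthropic -> internal: wrap in `function` and restore `type: "function"`.
pub fn from_anthropic_tool(tool: &AnthropicTool) -> InternalTool {
    InternalTool {
        kind: "function".to_string(),
        function: FunctionDef {
            name: tool.name.clone(),
            description: tool.description.clone(),
            parameters: tool.input_schema.clone(),
        },
    }
}
```

The schema itself passes through unchanged in both directions. Only the wrapper and the field name differ.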
 ### Context Windows
 *   **claude-3-5-sonnet-20241022:** 200,000 tokens
 *   **claude-3-5-haiku-20241022:** 200,000 tokens
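
One way the context window lookup might look, as a sketch. `claude_context_window` is a hypothetical helper, and returning `None` for unlisted models is an assumption that leaves the Ollama case to existing logic.

```rust
/// Context window shared by the Claude models listed in the spec.
pub const CLAUDE_CONTEXT_TOKENS: usize = 200_000;

/// Returns the context window for a known Claude model, or `None` for any
/// other model name so the caller can fall back to its existing handling.
pub fn claude_context_window(model: &str) -> Option<usize> {
    match model {
        "claude-3-5-sonnet-20241022" | "claude-3-5-haiku-20241022" => {
            Some(CLAUDE_CONTEXT_TOKENS)
        }
        _ => None,
    }
}
```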
 ### API Key Storage
 *   **Storage:** OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
 *   **Crate:** `keyring` for cross-platform support
 *   **Service Name:** `living-spec-anthropic-api-key`
 *   **Username:** `default`
 *   **Retrieval:** On first use of a Claude model, check the keychain. If no key is found, prompt the user.
 ## 4. Chat Loop (Backend)
 The `chat` command acts as the **Agent Loop**:
 1.  Frontend sends: `User Message`.
 2.  Backend appends to `SessionState.history`.
@@ -24,6 +107,44 @@ The `chat` command acts as the **Agent Loop**:
    *   Backend *re-prompts* Ollama with the new history (recursion).
    *   Repeat until Text Response or Max Turns reached.
-## 4. Frontend State
+## 5. Model Selection UI
-*   **Settings:** Store `llm_provider` ("ollama"), `ollama_model` ("llama3.2"), `ollama_base_url`.
+
 ### Unified Dropdown
 The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:
 ```html
 <select>
  <optgroup label="Anthropic">
    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
  </optgroup>
  <optgroup label="Ollama">
    <option value="deepseek-r1:70b">deepseek-r1:70b</option>
    <option value="llama3.1">llama3.1</option>
    <option value="qwen2.5">qwen2.5</option>
  </optgroup>
 </select>
 ```
 ### Model List Sources
 *   **Ollama:** Fetched from `http://localhost:11434/api/tags` via `get_ollama_models` command
 *   **Anthropic:** Hardcoded list of supported Claude models (fetching the list from the API is out of scope for this story)
 ### API Key Flow
 1. User selects a Claude model from dropdown
 2. Frontend sends chat request to backend
 3. Backend detects `claude-` prefix in model name
 4. Backend checks OS keychain for stored API key
 5. If not found:
   - Backend returns error: "Anthropic API key not found"
   - Frontend shows dialog prompting for API key
   - User enters key
   - Frontend calls `set_anthropic_api_key` command
   - Backend stores key in OS keychain
   - User retries chat request
 6. If found: Backend proceeds with Anthropic API request
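
The backend side of steps 4 to 6 can be sketched as follows. To keep the example self-contained, a hypothetical `KeyStore` trait with an in-memory implementation stands in for the OS keychain. In the real backend the `keyring` crate would fill that role. The service name, username, and error text come from this spec, and everything else is an assumption.

```rust
use std::collections::HashMap;

pub const SERVICE: &str = "living-spec-anthropic-api-key";
pub const USERNAME: &str = "default";

/// Hypothetical abstraction over the OS keychain.
pub trait KeyStore {
    fn get(&self, service: &str, username: &str) -> Option<String>;
    fn set(&mut self, service: &str, username: &str, secret: &str);
}

/// In-memory stand-in for the keychain, for illustration and tests only.
#[derive(Default)]
pub struct MemoryStore {
    entries: HashMap<(String, String), String>,
}

impl KeyStore for MemoryStore {
    fn get(&self, service: &str, username: &str) -> Option<String> {
        self.entries
            .get(&(service.to_string(), username.to_string()))
            .cloned()
    }

    fn set(&mut self, service: &str, username: &str, secret: &str) {
        self.entries
            .insert((service.to_string(), username.to_string()), secret.to_string());
    }
}

/// Steps 4-6: look up the key, or return the error the frontend reacts to.
pub fn resolve_api_key(store: &dyn KeyStore) -> Result<String, String> {
    store
        .get(SERVICE, USERNAME)
        .ok_or_else(|| "Anthropic API key not found".to_string())
}

/// Body of the `set_anthropic_api_key` command: reject blank input, then store.
pub fn set_anthropic_api_key(store: &mut dyn KeyStore, key: &str) -> Result<(), String> {
    let key = key.trim();
    if key.is_empty() {
        return Err("API key must not be empty".to_string());
    }
    store.set(SERVICE, USERNAME, key);
    Ok(())
}
```

Returning a plain string error keeps the sketch short. A typed error would let the frontend distinguish "key missing" from other failures without matching on message text.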
 ## 6. Frontend State
 *   **Settings:** Store `selected_model` (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
 *   **Provider Detection:** Auto-detected from model name (frontend doesn't need to track provider separately)
 *   **Chat:** Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).
--- a/.living_spec/specs/tech/STACK.md
+++ b/.living_spec/specs/tech/STACK.md
@@ -31,12 +31,18 @@ To support both Remote and Local models, the system implements a `ModelProvider`
    *   Abstract the differences between API formats (OpenAI-compatible vs Anthropic vs Gemini).
    *   Normalize "Tool Use" definitions, as each provider handles function calling schemas differently.
 *   **Supported Providers:**
    *   **Anthropic:** Focus on Claude 3.5 Sonnet for coding tasks.
    *   **Google:** Gemini 1.5 Pro for massive context windows.
    *   **Ollama:** Local inference (e.g., Llama 3, DeepSeek Coder) for privacy and offline usage.
-*   **Configuration:**
+    *   **Anthropic:** Claude 3.5 models (Sonnet, Haiku) via API for coding tasks (Story 12).
-    *   Provider selection is runtime-configurable by the user.
+*   **Provider Selection:**
-    *   API Keys must be stored securely (using OS native keychain where possible).
+    *   Automatic detection based on model name prefix:
        *   `claude-` → Anthropic API
        *   Otherwise → Ollama
    *   Single unified model dropdown with section headers ("Anthropic", "Ollama")
 *   **API Key Management:**
    *   Anthropic API key stored in OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
    *   Uses `keyring` crate for cross-platform secure storage
    *   On first use of Claude model, user prompted to enter API key
    *   Key persists across sessions (no re-entry needed)
 ## Tooling Capabilities
@@ -90,7 +96,9 @@ To support both Remote and Local models, the system implements a `ModelProvider`
    *   `ignore`: Fast recursive directory iteration respecting gitignore.
    *   `walkdir`: Simple directory traversal.
    *   `tokio`: Async runtime.
-    *   `reqwest`: For LLM API calls (if backend-initiated).
+    *   `reqwest`: For LLM API calls (Anthropic, Ollama).
    *   `eventsource-stream`: For Server-Sent Events (Anthropic streaming).
    *   `keyring`: Secure API key storage in OS keychain.
    *   `uuid`: For unique message IDs.
    *   `chrono`: For timestamps.
    *   `tauri-plugin-dialog`: Native system dialogs.
--- a/.living_spec/stories/12_be_able_to_use_claude.md
+++ b/.living_spec/stories/12_be_able_to_use_claude.md
@@ -0,0 +1,83 @@
 # Story 12: Be Able to Use Claude
 ## User Story
 As a user, I want to be able to select Claude (via Anthropic API) as my LLM provider so I can use Claude models instead of only local Ollama models.
 ## Acceptance Criteria
 - [ ] Claude models appear in the unified model dropdown (same dropdown as Ollama models)
 - [ ] Dropdown is organized with section headers: "Anthropic" and "Ollama" with models listed under each
 - [ ] When user first selects a Claude model, a dialog prompts for Anthropic API key
 - [ ] API key is stored securely in OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
 - [ ] Provider is auto-detected from model name (starts with `claude-` = Anthropic, otherwise = Ollama)
 - [ ] Chat requests route to Anthropic API when Claude model is selected
 - [ ] Streaming responses work with Claude (token-by-token display)
 - [ ] Tool calling works with Claude (using Anthropic's tool format)
 - [ ] Context window calculation accounts for Claude models (200k tokens)
 - [ ] User's model selection persists between sessions
 - [ ] Clear error messages if API key is missing or invalid
 ## Out of Scope
 - Support for other providers (OpenAI, Google, etc.) - can be added later
 - API key management UI (rotation, multiple keys, view/edit key after initial entry)
 - Cost tracking or usage monitoring
 - Model fine-tuning or custom models
 - Switching models mid-conversation (user can start new session)
 - Fetching available Claude models from API (hardcoded list is fine)
 ## Technical Notes
 - Anthropic API endpoint: `https://api.anthropic.com/v1/messages`
 - API key is stored in the OS keychain via the `keyring` crate (not in files or environment variables)
 - Claude models support tool use (function calling)
 - Context windows: claude-3-5-sonnet (200k), claude-3-5-haiku (200k)
 - Streaming uses Server-Sent Events (SSE)
 - Tool format differs from OpenAI/Ollama - needs conversion
 ## Design Considerations
 - Single unified model dropdown with section headers ("Anthropic", "Ollama")
 - Use `<optgroup>` in HTML select for visual grouping
 - API key dialog appears on-demand (first use of Claude model)
 - Store API key in OS keychain using `keyring` crate (cross-platform)
 - Backend auto-detects provider from model name pattern
 - Handle API key in backend only (don't expose to frontend logs)
 - Alphabetical sorting within each provider section
 ## Implementation Approach
 ### Backend (Rust)
 1. Add `anthropic` feature/module for Claude API client
 2. Create `AnthropicClient` with streaming support
 3. Convert tool definitions to Anthropic format
 4. Handle Anthropic streaming response format
 5. Add API key storage in the OS keychain via the `keyring` crate
 ### Frontend (TypeScript)
 1. Add hardcoded list of Claude models (claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022)
 2. Merge Ollama and Claude models into single dropdown with `<optgroup>` sections
 3. Create API key input dialog/modal component
 4. Trigger API key dialog when Claude model selected and no key stored
 5. Invoke a Tauri command to check whether an API key exists in the keychain
 6. Invoke the `set_anthropic_api_key` Tauri command to store the API key in the keychain
 7. Update context window calculations for Claude models (200k tokens)
 ### API Differences
 - Anthropic uses `messages` array format (similar to OpenAI)
 - Tools are passed in a `tools` array, but each definition uses a different schema (`input_schema` instead of `function.parameters`)
 - Streaming events have different structure
 - Need to map our tool format to Anthropic's format
 ## Security Considerations
 - API key stored in OS keychain (not in files or environment variables)
 - Use `keyring` crate for cross-platform secure storage
 - Never log API key in console or files
 - Backend validates API key format before making requests
 - Handle API errors gracefully (rate limits, invalid key, network errors)
 - API key only accessible to the app process
 ## UI Flow
 1. User opens model dropdown → sees "Anthropic" section with Claude models, "Ollama" section with local models
 2. User selects `claude-3-5-sonnet-20241022`
 3. Backend checks keychain for stored API key
 4. If not found → Frontend shows dialog: "Enter your Anthropic API key"
 5. User enters key → Backend stores in OS keychain
 6. Chat proceeds with Anthropic API
 7. Future sessions: API key auto-loaded from keychain (no prompt)