From e71dcd8226e7c18b7b169ddf645af48a762ad2f9 Mon Sep 17 00:00:00 2001
From: Dave
Date: Sat, 27 Dec 2025 19:37:01 +0000
Subject: [PATCH] Story 12: Update story and specs for Claude integration

Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (claude-* prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
---
 .../specs/functional/AI_INTEGRATION.md   | 127 +++++++++++++++++-
 .living_spec/specs/tech/STACK.md         |  20 ++-
 .../stories/12_be_able_to_use_claude.md  |  83 ++++++++++++
 3 files changed, 221 insertions(+), 9 deletions(-)

diff --git a/.living_spec/specs/functional/AI_INTEGRATION.md b/.living_spec/specs/functional/AI_INTEGRATION.md
index 40a8e91..10f3076 100644
--- a/.living_spec/specs/functional/AI_INTEGRATION.md
+++ b/.living_spec/specs/functional/AI_INTEGRATION.md
@@ -5,6 +5,14 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
 * **Generation:** Sending prompt + history + tools to the model.
 * **Parsing:** Extracting text content vs. tool calls from the raw response.
 
+The system supports multiple LLM providers:
+* **Ollama:** Local models running via Ollama server
+* **Anthropic:** Claude models via Anthropic API (Story 12)
+
+Provider selection is **automatic** based on model name:
+* Model starts with `claude-` → Anthropic provider
+* Otherwise → Ollama provider
+
 ## 2. Ollama Implementation
 * **Endpoint:** `http://localhost:11434/api/chat`
 * **JSON Protocol:**
@@ -12,7 +20,82 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
   * Response: Standard Ollama JSON with `message.tool_calls`.
 * **Fallback:** If the specific local model doesn't support native tool calling, we may need a fallback system prompt approach, but for this story, we assume a tool-capable model (like `llama3.1` or `mistral-nemo`).
 
-## 3. Chat Loop (Backend)
+## 3. Anthropic (Claude) Implementation
+
+### Endpoint
+* **Base URL:** `https://api.anthropic.com/v1/messages`
+* **Authentication:** Requires `x-api-key` header with Anthropic API key
+* **API Version:** `anthropic-version: 2023-06-01` header required
+
+### API Protocol
+* **Request Format:**
+  ```json
+  {
+    "model": "claude-3-5-sonnet-20241022",
+    "max_tokens": 4096,
+    "messages": [
+      {"role": "user", "content": "Hello"},
+      {"role": "assistant", "content": "Hi!"}
+    ],
+    "tools": [...],
+    "stream": true
+  }
+  ```
+* **Response Format (Streaming):**
+  * Server-Sent Events (SSE)
+  * Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_stop`
+  * Tool calls appear as a `content_block` with `type: "tool_use"`
+
+### Tool Format Differences
+Anthropic's tool format differs from Ollama/OpenAI:
+
+**Anthropic Tool Definition:**
+```json
+{
+  "name": "read_file",
+  "description": "Reads a file",
+  "input_schema": {
+    "type": "object",
+    "properties": {
+      "path": {"type": "string"}
+    },
+    "required": ["path"]
+  }
+}
+```
+
+**Our Internal Format:**
+```json
+{
+  "type": "function",
+  "function": {
+    "name": "read_file",
+    "description": "Reads a file",
+    "parameters": {
+      "type": "object",
+      "properties": {
+        "path": {"type": "string"}
+      },
+      "required": ["path"]
+    }
+  }
+}
+```
+
+The backend must convert between these formats.
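The conversion described above is mechanical: lift `function.name`, `function.description`, and `function.parameters` to the top level, renaming `parameters` to `input_schema`. A minimal sketch, shown in TypeScript for brevity (the shipped conversion would live in the Rust backend; `toAnthropicTool` and the interface names are illustrative, not actual project code):

```typescript
// Our internal (OpenAI-style) tool definition, as shown in the spec above.
interface InternalTool {
  type: "function";
  function: { name: string; description: string; parameters: object };
}

// Anthropic's flattened tool definition.
interface AnthropicTool {
  name: string;
  description: string;
  input_schema: object;
}

// Lift the nested fields to the top level and rename parameters → input_schema.
function toAnthropicTool(tool: InternalTool): AnthropicTool {
  return {
    name: tool.function.name,
    description: tool.function.description,
    input_schema: tool.function.parameters,
  };
}

const internal: InternalTool = {
  type: "function",
  function: {
    name: "read_file",
    description: "Reads a file",
    parameters: {
      type: "object",
      properties: { path: { type: "string" } },
      required: ["path"],
    },
  },
};

console.log(JSON.stringify(toAnthropicTool(internal)));
```

The reverse mapping, for `tool_use` blocks coming back from the API, inverts the same three fields.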
+
+### Context Windows
+* **claude-3-5-sonnet-20241022:** 200,000 tokens
+* **claude-3-5-haiku-20241022:** 200,000 tokens
+
+### API Key Storage
+* **Storage:** OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
+* **Crate:** `keyring` for cross-platform support
+* **Service Name:** `living-spec-anthropic-api-key`
+* **Username:** `default`
+* **Retrieval:** On first use of a Claude model, check the keychain. If no key is found, prompt the user.
+
+## 4. Chat Loop (Backend)
 The `chat` command acts as the **Agent Loop**:
 1. Frontend sends: `User Message`.
 2. Backend appends to `SessionState.history`.
@@ -24,6 +107,44 @@ The `chat` command acts as the **Agent Loop**:
   * Backend *re-prompts* Ollama with the new history (recursion).
   * Repeat until Text Response or Max Turns reached.
 
-## 4. Frontend State
-* **Settings:** Store `llm_provider` ("ollama"), `ollama_model` ("llama3.2"), `ollama_base_url`.
+## 5. Model Selection UI
+
+### Unified Dropdown
+The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:
+
+```html
+<select id="model-select">
+  <optgroup label="Anthropic">
+    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
+    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
+  </optgroup>
+  <optgroup label="Ollama">
+    <option value="llama3.1">llama3.1</option>
+  </optgroup>
+</select>
+```
+
+### Model List Sources
+* **Ollama:** Fetched from `http://localhost:11434/api/tags` via the `get_ollama_models` command
+* **Anthropic:** Hardcoded list of supported Claude models (no API to fetch available models)
+
+### API Key Flow
+1. User selects a Claude model from the dropdown
+2. Frontend sends a chat request to the backend
+3. Backend detects the `claude-` prefix in the model name
+4. Backend checks the OS keychain for a stored API key
+5. If not found:
+   - Backend returns an error: "Anthropic API key not found"
+   - Frontend shows a dialog prompting for the API key
+   - User enters the key
+   - Frontend calls the `set_anthropic_api_key` command
+   - Backend stores the key in the OS keychain
+   - User retries the chat request
+6. If found: Backend proceeds with the Anthropic API request
+
+## 6. Frontend State
+* **Settings:** Store `selected_model` (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
+* **Provider Detection:** Auto-detected from the model name (the frontend doesn't need to track the provider separately)
 * **Chat:** Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).

diff --git a/.living_spec/specs/tech/STACK.md b/.living_spec/specs/tech/STACK.md
index f32b316..0e7bb6c 100644
--- a/.living_spec/specs/tech/STACK.md
+++ b/.living_spec/specs/tech/STACK.md
@@ -31,12 +31,18 @@ To support both Remote and Local models, the system implements a `ModelProvider`
   * Abstract the differences between API formats (OpenAI-compatible vs Anthropic vs Gemini).
   * Normalize "Tool Use" definitions, as each provider handles function calling schemas differently.
 * **Supported Providers:**
-  * **Anthropic:** Focus on Claude 3.5 Sonnet for coding tasks.
-  * **Google:** Gemini 1.5 Pro for massive context windows.
   * **Ollama:** Local inference (e.g., Llama 3, DeepSeek Coder) for privacy and offline usage.
-* **Configuration:**
-  * Provider selection is runtime-configurable by the user.
-  * API Keys must be stored securely (using OS native keychain where possible).
+  * **Anthropic:** Claude 3.5 models (Sonnet, Haiku) via API for coding tasks (Story 12).
+* **Provider Selection:**
+  * Automatic detection based on model name prefix:
+    * `claude-` → Anthropic API
+    * Otherwise → Ollama
+  * Single unified model dropdown with section headers ("Anthropic", "Ollama")
+* **API Key Management:**
+  * Anthropic API key stored in OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
+  * Uses `keyring` crate for cross-platform secure storage
+  * On first use of a Claude model, the user is prompted to enter an API key
+  * Key persists across sessions (no re-entry needed)
 
 ## Tooling Capabilities
 
@@ -90,7 +96,9 @@ To support both Remote and Local models, the system implements a `ModelProvider`
   * `ignore`: Fast recursive directory iteration respecting gitignore.
   * `walkdir`: Simple directory traversal.
   * `tokio`: Async runtime.
-  * `reqwest`: For LLM API calls (if backend-initiated).
+  * `reqwest`: For LLM API calls (Anthropic, Ollama).
+  * `eventsource-stream`: For Server-Sent Events (Anthropic streaming).
+  * `keyring`: Secure API key storage in OS keychain.
   * `uuid`: For unique message IDs.
   * `chrono`: For timestamps.
   * `tauri-plugin-dialog`: Native system dialogs.

diff --git a/.living_spec/stories/12_be_able_to_use_claude.md b/.living_spec/stories/12_be_able_to_use_claude.md
index e69de29..6e090ce 100644
--- a/.living_spec/stories/12_be_able_to_use_claude.md
+++ b/.living_spec/stories/12_be_able_to_use_claude.md
@@ -0,0 +1,83 @@
+# Story 12: Be Able to Use Claude
+
+## User Story
+As a user, I want to be able to select Claude (via the Anthropic API) as my LLM provider so I can use Claude models instead of only local Ollama models.
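The automatic provider selection documented above reduces to a single prefix check. A hedged sketch in TypeScript (`detectProvider` is an illustrative name; the actual detection runs in the Rust backend):

```typescript
// Illustrative provider detection: mirrors the documented rule that model
// names beginning with "claude-" route to Anthropic, and every other name
// routes to the local Ollama server.
type Provider = "anthropic" | "ollama";

function detectProvider(model: string): Provider {
  return model.startsWith("claude-") ? "anthropic" : "ollama";
}

console.log(detectProvider("claude-3-5-sonnet-20241022")); // anthropic
console.log(detectProvider("llama3.1")); // ollama
```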
+
+## Acceptance Criteria
+- [ ] Claude models appear in the unified model dropdown (same dropdown as Ollama models)
+- [ ] Dropdown is organized with section headers "Anthropic" and "Ollama", with models listed under each
+- [ ] When the user first selects a Claude model, a dialog prompts for the Anthropic API key
+- [ ] API key is stored securely in the OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
+- [ ] Provider is auto-detected from the model name (starts with `claude-` = Anthropic, otherwise = Ollama)
+- [ ] Chat requests route to the Anthropic API when a Claude model is selected
+- [ ] Streaming responses work with Claude (token-by-token display)
+- [ ] Tool calling works with Claude (using Anthropic's tool format)
+- [ ] Context window calculation accounts for Claude models (200k tokens)
+- [ ] The user's model selection persists between sessions
+- [ ] Clear error messages if the API key is missing or invalid
+
+## Out of Scope
+- Support for other providers (OpenAI, Google, etc.) - can be added later
+- API key management UI (rotation, multiple keys, view/edit key after initial entry)
+- Cost tracking or usage monitoring
+- Model fine-tuning or custom models
+- Switching models mid-conversation (the user can start a new session)
+- Fetching available Claude models from the API (a hardcoded list is fine)
+
+## Technical Notes
+- Anthropic API endpoint: `https://api.anthropic.com/v1/messages`
+- API key should be stored securely in the OS keychain (not in environment variables or plain files)
+- Claude models support tool use (function calling)
+- Context windows: claude-3-5-sonnet (200k), claude-3-5-haiku (200k)
+- Streaming uses Server-Sent Events (SSE)
+- Tool format differs from OpenAI/Ollama - needs conversion
+
+## Design Considerations
+- Single unified model dropdown with section headers ("Anthropic", "Ollama")
+- Use `<optgroup>` in the HTML select for visual grouping
+- API key dialog appears on demand (first use of a Claude model)
+- Store the API key in the OS keychain using the `keyring` crate (cross-platform)
+- Backend auto-detects the provider from the model name pattern
+- Handle the API key in the backend only (don't expose it to frontend logs)
+- Alphabetical sorting within each provider section
+
+## Implementation Approach
+
+### Backend (Rust)
+1. Add an `anthropic` feature/module for the Claude API client
+2. Create an `AnthropicClient` with streaming support
+3. Convert tool definitions to the Anthropic format
+4. Handle the Anthropic streaming response format
+5. Add API key storage in the OS keychain (via the `keyring` crate)
+
+### Frontend (TypeScript)
+1. Add a hardcoded list of Claude models (claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022)
+2. Merge Ollama and Claude models into a single dropdown with `<optgroup>` sections
+3. Create an API key input dialog/modal component
+4. Trigger the API key dialog when a Claude model is selected and no key is stored
+5. Add a Tauri command to check whether an API key exists in the keychain
+6. Add a Tauri command to set the API key in the keychain
+7. Update context window calculations for Claude models (200k tokens)
+
+### API Differences
+- Anthropic uses a `messages` array format (similar to OpenAI)
+- Tools are passed as `tools` but with a different schema
+- Streaming events have a different structure
+- Our tool format must be mapped to Anthropic's format
+
+## Security Considerations
+- API key stored in the OS keychain (not in files or environment variables)
+- Use the `keyring` crate for cross-platform secure storage
+- Never log the API key to the console or files
+- Backend validates the API key format before making requests
+- Handle API errors gracefully (rate limits, invalid key, network errors)
+- API key accessible only to the app process
+
+## UI Flow
+1. User opens the model dropdown → sees an "Anthropic" section with Claude models and an "Ollama" section with local models
+2. User selects `claude-3-5-sonnet-20241022`
+3. Backend checks the keychain for a stored API key
+4. If not found → Frontend shows a dialog: "Enter your Anthropic API key"
+5. User enters the key → Backend stores it in the OS keychain
+6. Chat proceeds with the Anthropic API
+7. Future sessions: API key auto-loaded from the keychain (no prompt)
\ No newline at end of file
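To illustrate the streaming structure the story depends on: each SSE `data:` payload is one JSON event, and only `content_block_delta` events carrying a `text_delta` contain displayable text. A minimal sketch in TypeScript (`extractTextDelta` is an illustrative name; the event shape assumed here is the one documented in the spec sections above):

```typescript
// Illustrative SSE handling: parse one Anthropic stream event payload and
// return its text delta, or null for non-text events (message_start,
// tool_use content blocks, message_stop, etc.).
function extractTextDelta(data: string): string | null {
  const event = JSON.parse(data);
  if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
    return event.delta.text;
  }
  return null;
}

const sample =
  '{"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}';
console.log(extractTextDelta(sample)); // Hello
console.log(extractTextDelta('{"type":"message_stop"}')); // null
```

In the real backend, the `eventsource-stream` crate would yield these payloads one at a time, and each non-null delta would be forwarded to the frontend for token-by-token display.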