Story 12: Update story and specs for Claude integration
Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (`claude-` prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
@@ -5,6 +5,14 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
* **Generation:** Sending prompt + history + tools to the model.
* **Parsing:** Extracting text content vs. tool calls from the raw response.

The system supports multiple LLM providers:

* **Ollama:** Local models running via Ollama server
* **Anthropic:** Claude models via Anthropic API (Story 12)

Provider selection is **automatic** based on model name:

* Model starts with `claude-` → Anthropic provider
* Otherwise → Ollama provider
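The prefix rule above can be sketched as a small helper; the names here are illustrative, not the actual codebase API:

```typescript
type Provider = "anthropic" | "ollama";

// Any model id starting with "claude-" routes to the Anthropic API;
// everything else is assumed to be served by the local Ollama server.
function detectProvider(model: string): Provider {
  return model.startsWith("claude-") ? "anthropic" : "ollama";
}
```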
## 2. Ollama Implementation

* **Endpoint:** `http://localhost:11434/api/chat`
* **JSON Protocol:**
@@ -12,7 +20,82 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
* Response: Standard Ollama JSON with `message.tool_calls`.
* **Fallback:** If the specific local model doesn't support native tool calling, we may need a fallback system prompt approach, but for this story, we assume a tool-capable model (like `llama3.1` or `mistral-nemo`).
## 3. Anthropic (Claude) Implementation

### Endpoint

* **Base URL:** `https://api.anthropic.com/v1/messages`
* **Authentication:** Requires `x-api-key` header with Anthropic API key
* **API Version:** `anthropic-version: 2023-06-01` header required
### API Protocol

* **Request Format:**

```json
{
  "model": "claude-3-5-sonnet-20241022",
  "max_tokens": 4096,
  "messages": [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi!"}
  ],
  "tools": [...],
  "stream": true
}
```

* **Response Format (Streaming):**
  * Server-Sent Events (SSE)
  * Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_stop`
  * Tool calls appear as a `content_block` with `type: "tool_use"`
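A minimal sketch of pulling text deltas out of individual SSE `data:` lines, assuming the event types listed above; the helper name is an invention for illustration, and real code would use a proper SSE parser rather than line splitting:

```typescript
// Returns the text fragment carried by a content_block_delta event,
// or null for any other SSE line (message_start, message_stop, etc.).
function extractTextDelta(sseLine: string): string | null {
  if (!sseLine.startsWith("data: ")) return null;
  const event = JSON.parse(sseLine.slice("data: ".length));
  if (event.type === "content_block_delta" && event.delta?.type === "text_delta") {
    return event.delta.text;
  }
  return null;
}
```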
### Tool Format Differences

Anthropic's tool format differs from Ollama/OpenAI:

**Anthropic Tool Definition:**

```json
{
  "name": "read_file",
  "description": "Reads a file",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {"type": "string"}
    },
    "required": ["path"]
  }
}
```
**Our Internal Format:**

```json
{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Reads a file",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      },
      "required": ["path"]
    }
  }
}
```

The backend must convert between these formats.
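The conversion is mechanical, since the JSON Schema body is identical and only the wrapper differs. A sketch with simplified assumed types (the real backend does this in Rust):

```typescript
interface InternalTool {
  type: "function";
  function: { name: string; description: string; parameters: object };
}

interface AnthropicTool {
  name: string;
  description: string;
  input_schema: object;
}

// Unwrap the OpenAI-style envelope and rename `parameters` to `input_schema`;
// the schema object itself passes through unchanged.
function toAnthropicTool(tool: InternalTool): AnthropicTool {
  return {
    name: tool.function.name,
    description: tool.function.description,
    input_schema: tool.function.parameters,
  };
}
```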
### Context Windows

* **claude-3-5-sonnet-20241022:** 200,000 tokens
* **claude-3-5-haiku-20241022:** 200,000 tokens
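A plausible lookup for the context-window calculation; the table matches the models above, but the function name and the fallback value for non-Claude models are assumptions:

```typescript
const CONTEXT_WINDOWS: Record<string, number> = {
  "claude-3-5-sonnet-20241022": 200_000,
  "claude-3-5-haiku-20241022": 200_000,
};

// Claude models get their documented 200k window; other (Ollama) models
// fall back to a placeholder default.
function contextWindow(model: string, fallback = 8_192): number {
  return CONTEXT_WINDOWS[model] ?? fallback;
}
```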
### API Key Storage

* **Storage:** OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
* **Crate:** `keyring` for cross-platform support
* **Service Name:** `living-spec-anthropic-api-key`
* **Username:** `default`
* **Retrieval:** On first use of a Claude model, check the keychain. If not found, prompt the user.
## 4. Chat Loop (Backend)

The `chat` command acts as the **Agent Loop**:

1. Frontend sends: `User Message`.
2. Backend appends to `SessionState.history`.
@@ -24,6 +107,44 @@ The `chat` command acts as the **Agent Loop**:
* Backend *re-prompts* the model with the new history (recursion).
* Repeat until Text Response or Max Turns reached.
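The loop above can be illustrated with the model call and tool executor injected; all names, the `ModelTurn` shape, and the string-based history encoding are simplifications for the sketch, not the real backend types:

```typescript
type ModelTurn =
  | { kind: "text"; content: string }
  | { kind: "tool_call"; name: string; args: object };

// Re-prompt the model after each tool result until it produces plain text,
// bounded by maxTurns to avoid infinite tool-call loops.
function agentLoop(
  history: string[],
  callModel: (history: string[]) => ModelTurn,
  runTool: (name: string, args: object) => string,
  maxTurns = 10,
): string {
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = callModel(history);
    if (reply.kind === "text") return reply.content; // final answer
    // Tool call: execute it, append the result, and loop to re-prompt.
    history.push(`tool:${reply.name}=${runTool(reply.name, reply.args)}`);
  }
  return "Max turns reached";
}
```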
## 5. Model Selection UI
### Unified Dropdown

The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:

```html
<select>
  <optgroup label="Anthropic">
    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
  </optgroup>
  <optgroup label="Ollama">
    <option value="deepseek-r1:70b">deepseek-r1:70b</option>
    <option value="llama3.1">llama3.1</option>
    <option value="qwen2.5">qwen2.5</option>
  </optgroup>
</select>
```
### Model List Sources

* **Ollama:** Fetched from `http://localhost:11434/api/tags` via `get_ollama_models` command
* **Anthropic:** Hardcoded list of supported Claude models (no API to fetch available models)
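A sketch of merging the hardcoded Claude list with the fetched Ollama list into the grouped structure that the `<optgroup>` markup renders; `buildModelGroups` is an assumed name, and the alphabetical sort within each section follows the story's design notes:

```typescript
// Hardcoded, per the spec: there is no Anthropic endpoint to list models.
const CLAUDE_MODELS = ["claude-3-5-sonnet-20241022", "claude-3-5-haiku-20241022"];

interface ModelGroup {
  label: "Anthropic" | "Ollama";
  models: string[];
}

function buildModelGroups(ollamaModels: string[]): ModelGroup[] {
  const sorted = (xs: string[]) => [...xs].sort(); // alphabetical within each section
  return [
    { label: "Anthropic", models: sorted(CLAUDE_MODELS) },
    { label: "Ollama", models: sorted(ollamaModels) },
  ];
}
```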
### API Key Flow

1. User selects a Claude model from the dropdown
2. Frontend sends chat request to backend
3. Backend detects `claude-` prefix in model name
4. Backend checks OS keychain for stored API key
5. If not found:
   - Backend returns error: "Anthropic API key not found"
   - Frontend shows dialog prompting for API key
   - User enters key
   - Frontend calls `set_anthropic_api_key` command
   - Backend stores key in OS keychain
   - User retries chat request
6. If found: Backend proceeds with Anthropic API request
## 6. Frontend State

* **Settings:** Store `selected_model` (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
* **Provider Detection:** Auto-detected from model name (frontend doesn't need to track provider separately)
* **Chat:** Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).
@@ -31,12 +31,18 @@ To support both Remote and Local models, the system implements a `ModelProvider`
* Abstract the differences between API formats (OpenAI-compatible vs Anthropic vs Gemini).
* Normalize "Tool Use" definitions, as each provider handles function calling schemas differently.
* **Supported Providers:**
    * **Ollama:** Local inference (e.g., Llama 3, DeepSeek Coder) for privacy and offline usage.
    * **Anthropic:** Claude 3.5 models (Sonnet, Haiku) via API for coding tasks (Story 12).
* **Provider Selection:**
    * Automatic detection based on model name prefix:
        * `claude-` → Anthropic API
        * Otherwise → Ollama
    * Single unified model dropdown with section headers ("Anthropic", "Ollama")
* **API Key Management:**
    * Anthropic API key stored in OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
    * Uses `keyring` crate for cross-platform secure storage
    * On first use of a Claude model, the user is prompted to enter an API key
    * Key persists across sessions (no re-entry needed)
## Tooling Capabilities
@@ -90,7 +96,9 @@ To support both Remote and Local models, the system implements a `ModelProvider`
* `ignore`: Fast recursive directory iteration respecting gitignore.
* `walkdir`: Simple directory traversal.
* `tokio`: Async runtime.
* `reqwest`: For LLM API calls (Anthropic, Ollama).
* `eventsource-stream`: For Server-Sent Events (Anthropic streaming).
* `keyring`: Secure API key storage in OS keychain.
* `uuid`: For unique message IDs.
* `chrono`: For timestamps.
* `tauri-plugin-dialog`: Native system dialogs.
@@ -0,0 +1,83 @@
# Story 12: Be Able to Use Claude

## User Story

As a user, I want to be able to select Claude (via the Anthropic API) as my LLM provider so I can use Claude models instead of only local Ollama models.

## Acceptance Criteria

- [ ] Claude models appear in the unified model dropdown (same dropdown as Ollama models)
- [ ] Dropdown is organized with section headers: "Anthropic" and "Ollama", with models listed under each
- [ ] When the user first selects a Claude model, a dialog prompts for an Anthropic API key
- [ ] API key is stored securely in the OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
- [ ] Provider is auto-detected from the model name (starts with `claude-` = Anthropic, otherwise = Ollama)
- [ ] Chat requests route to the Anthropic API when a Claude model is selected
- [ ] Streaming responses work with Claude (token-by-token display)
- [ ] Tool calling works with Claude (using Anthropic's tool format)
- [ ] Context window calculation accounts for Claude models (200k tokens)
- [ ] User's model selection persists between sessions
- [ ] Clear error messages if the API key is missing or invalid

## Out of Scope

- Support for other providers (OpenAI, Google, etc.) - can be added later
- API key management UI (rotation, multiple keys, view/edit key after initial entry)
- Cost tracking or usage monitoring
- Model fine-tuning or custom models
- Switching models mid-conversation (user can start a new session)
- Fetching available Claude models from the API (hardcoded list is fine)

## Technical Notes

- Anthropic API endpoint: `https://api.anthropic.com/v1/messages`
- API key must be stored securely in the OS keychain
- Claude models support tool use (function calling)
- Context windows: claude-3-5-sonnet (200k), claude-3-5-haiku (200k)
- Streaming uses Server-Sent Events (SSE)
- Tool format differs from OpenAI/Ollama and needs conversion

## Design Considerations

- Single unified model dropdown with section headers ("Anthropic", "Ollama")
- Use `<optgroup>` in the HTML select for visual grouping
- API key dialog appears on demand (first use of a Claude model)
- Store the API key in the OS keychain using the `keyring` crate (cross-platform)
- Backend auto-detects the provider from the model name pattern
- Handle the API key in the backend only (don't expose it to frontend logs)
- Alphabetical sorting within each provider section

## Implementation Approach

### Backend (Rust)

1. Add an `anthropic` module for the Claude API client
2. Create an `AnthropicClient` with streaming support
3. Convert tool definitions to Anthropic format
4. Handle the Anthropic streaming response format
5. Add API key storage via the OS keychain (`keyring` crate)

### Frontend (TypeScript)

1. Add a hardcoded list of Claude models (claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022)
2. Merge Ollama and Claude models into a single dropdown with `<optgroup>` sections
3. Create an API key input dialog/modal component
4. Trigger the API key dialog when a Claude model is selected and no key is stored
5. Add a Tauri command to check whether an API key exists in the keychain
6. Add a Tauri command to set the API key in the keychain
7. Update context window calculations for Claude models (200k tokens)
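The frontend's "prompt only when needed" decision (step 4 above) reduces to a small predicate; `needsApiKeyPrompt` is a hypothetical name, with the result of the keychain-check Tauri command passed in as a boolean:

```typescript
// Show the API key dialog only when a Claude model is chosen and the
// keychain check reported no stored key; Ollama models never prompt.
function needsApiKeyPrompt(model: string, hasStoredKey: boolean): boolean {
  return model.startsWith("claude-") && !hasStoredKey;
}
```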
### API Differences

- Anthropic uses a `messages` array format (similar to OpenAI)
- Tools are passed as `tools` but with a different schema
- Streaming events have a different structure
- Our tool format must be mapped to Anthropic's format

## Security Considerations

- API key stored in the OS keychain (not in files or environment variables)
- Use the `keyring` crate for cross-platform secure storage
- Never log the API key to the console or files
- Backend validates the API key format before making requests
- Handle API errors gracefully (rate limits, invalid key, network errors)
- API key only accessible to the app process

## UI Flow

1. User opens the model dropdown → sees an "Anthropic" section with Claude models and an "Ollama" section with local models
2. User selects `claude-3-5-sonnet-20241022`
3. Backend checks the keychain for a stored API key
4. If not found → Frontend shows a dialog: "Enter your Anthropic API key"
5. User enters the key → Backend stores it in the OS keychain
6. Chat proceeds with the Anthropic API
7. Future sessions: API key auto-loaded from the keychain (no prompt)