Story 12: Update story and specs for Claude integration

Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (claude-* prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
Author: Dave
Date: 2025-12-27 19:37:01 +00:00
Parent: ca7efc2888
Commit: e71dcd8226
3 changed files with 221 additions and 9 deletions

AI_INTEGRATION.md

@@ -5,6 +5,14 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
* **Generation:** Sending prompt + history + tools to the model.
* **Parsing:** Extracting text content vs. tool calls from the raw response.
The system supports multiple LLM providers:
* **Ollama:** Local models running via Ollama server
* **Anthropic:** Claude models via Anthropic API (Story 12)
Provider selection is **automatic** based on model name:
* Model starts with `claude-` → Anthropic provider
* Otherwise → Ollama provider
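The detection rule above can be sketched as follows (a minimal sketch; the enum and function names are illustrative, not part of the actual backend):

```rust
/// Illustrative provider enum and detection helper (names are hypothetical).
#[derive(Debug, PartialEq)]
enum Provider {
    Anthropic,
    Ollama,
}

/// Any model whose name starts with `claude-` routes to the Anthropic API;
/// everything else falls through to the local Ollama server.
fn detect_provider(model: &str) -> Provider {
    if model.starts_with("claude-") {
        Provider::Anthropic
    } else {
        Provider::Ollama
    }
}
```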
## 2. Ollama Implementation
* **Endpoint:** `http://localhost:11434/api/chat`
* **JSON Protocol:**
@@ -12,7 +20,82 @@ The system uses a pluggable architecture for LLMs. The `ModelProvider` interface
* Response: Standard Ollama JSON with `message.tool_calls`.
* **Fallback:** If the specific local model doesn't support native tool calling, we may need a fallback system prompt approach, but for this story, we assume a tool-capable model (like `llama3.1` or `mistral-nemo`).
## 3. Chat Loop (Backend)
## 3. Anthropic (Claude) Implementation
### Endpoint
* **Base URL:** `https://api.anthropic.com/v1/messages`
* **Authentication:** Requires `x-api-key` header with Anthropic API key
* **API Version:** `anthropic-version: 2023-06-01` header required
### API Protocol
* **Request Format:**
```json
{
"model": "claude-3-5-sonnet-20241022",
"max_tokens": 4096,
"messages": [
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi!"}
],
"tools": [...],
"stream": true
}
```
* **Response Format (Streaming):**
* Server-Sent Events (SSE)
* Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_stop`
* Tool calls appear as `content_block` with `type: "tool_use"`
### Tool Format Differences
Anthropic's tool format differs from Ollama/OpenAI:
**Anthropic Tool Definition:**
```json
{
"name": "read_file",
"description": "Reads a file",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
}
}
```
**Our Internal Format:**
```json
{
"type": "function",
"function": {
"name": "read_file",
"description": "Reads a file",
"parameters": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
}
}
}
```
The backend must convert between these formats.
### Context Windows
* **claude-3-5-sonnet-20241022:** 200,000 tokens
* **claude-3-5-haiku-20241022:** 200,000 tokens
### API Key Storage
* **Storage:** OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
* **Crate:** `keyring` for cross-platform support
* **Service Name:** `living-spec-anthropic-api-key`
* **Username:** `default`
* **Retrieval:** On first use of Claude model, check keychain. If not found, prompt user.
## 4. Chat Loop (Backend)
The `chat` command acts as the **Agent Loop**:
1. Frontend sends: `User Message`.
2. Backend appends to `SessionState.history`.
@@ -24,6 +107,44 @@ The `chat` command acts as the **Agent Loop**:
* Backend *re-prompts* Ollama with the new history (recursion).
* Repeat until Text Response or Max Turns reached.
## 4. Frontend State
* **Settings:** Store `llm_provider` ("ollama"), `ollama_model` ("llama3.2"), `ollama_base_url`.
## 5. Model Selection UI
### Unified Dropdown
The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:
```html
<select>
<optgroup label="Anthropic">
<option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
<option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
</optgroup>
<optgroup label="Ollama">
<option value="deepseek-r1:70b">deepseek-r1:70b</option>
<option value="llama3.1">llama3.1</option>
<option value="qwen2.5">qwen2.5</option>
</optgroup>
</select>
```
### Model List Sources
* **Ollama:** Fetched from `http://localhost:11434/api/tags` via `get_ollama_models` command
* **Anthropic:** Hardcoded list of supported Claude models (no API to fetch available models)
### API Key Flow
1. User selects a Claude model from dropdown
2. Frontend sends chat request to backend
3. Backend detects `claude-` prefix in model name
4. Backend checks OS keychain for stored API key
5. If not found:
- Backend returns error: "Anthropic API key not found"
- Frontend shows dialog prompting for API key
- User enters key
- Frontend calls `set_anthropic_api_key` command
- Backend stores key in OS keychain
- User retries chat request
6. If found: Backend proceeds with Anthropic API request
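Steps 3-6 on the backend side reduce to a gate along these lines (a sketch; the function name and signature are ours):

```rust
/// Hypothetical gate run before dispatching a chat request:
/// Claude models require a stored API key; Ollama models do not.
fn ensure_key_for_model(model: &str, stored_key: Option<&str>) -> Result<(), String> {
    if model.starts_with("claude-") && stored_key.is_none() {
        // The frontend reacts to this message by showing the API key dialog.
        return Err("Anthropic API key not found".to_string());
    }
    Ok(())
}
```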
## 6. Frontend State
* **Settings:** Store `selected_model` (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
* **Provider Detection:** Auto-detected from model name (frontend doesn't need to track provider separately)
* **Chat:** Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).

STACK.md

@@ -31,12 +31,18 @@ To support both Remote and Local models, the system implements a `ModelProvider`
* Abstract the differences between API formats (OpenAI-compatible vs Anthropic vs Gemini).
* Normalize "Tool Use" definitions, as each provider handles function calling schemas differently.
* **Supported Providers:**
* **Anthropic:** Focus on Claude 3.5 Sonnet for coding tasks.
* **Google:** Gemini 1.5 Pro for massive context windows.
* **Ollama:** Local inference (e.g., Llama 3, DeepSeek Coder) for privacy and offline usage.
* **Configuration:**
* Provider selection is runtime-configurable by the user.
* API Keys must be stored securely (using OS native keychain where possible).
* **Anthropic:** Claude 3.5 models (Sonnet, Haiku) via API for coding tasks (Story 12).
* **Provider Selection:**
* Automatic detection based on model name prefix:
* `claude-` → Anthropic API
* Otherwise → Ollama
* Single unified model dropdown with section headers ("Anthropic", "Ollama")
* **API Key Management:**
* Anthropic API key stored in OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
* Uses `keyring` crate for cross-platform secure storage
* On first use of Claude model, user prompted to enter API key
* Key persists across sessions (no re-entry needed)
## Tooling Capabilities
@@ -90,7 +96,9 @@ To support both Remote and Local models, the system implements a `ModelProvider`
* `ignore`: Fast recursive directory iteration respecting gitignore.
* `walkdir`: Simple directory traversal.
* `tokio`: Async runtime.
* `reqwest`: For LLM API calls (if backend-initiated).
* `reqwest`: For LLM API calls (Anthropic, Ollama).
* `eventsource-stream`: For Server-Sent Events (Anthropic streaming).
* `keyring`: Secure API key storage in OS keychain.
* `uuid`: For unique message IDs.
* `chrono`: For timestamps.
* `tauri-plugin-dialog`: Native system dialogs.