storkit/.living_spec/specs/functional/AI_INTEGRATION.md
Dave e71dcd8226 Story 12: Update story and specs for Claude integration
Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (claude-* prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
2025-12-27 19:37:01 +00:00


Functional Spec: AI Integration

1. Provider Abstraction

The system uses a pluggable architecture for LLMs. The ModelProvider interface abstracts:

  • Generation: Sending prompt + history + tools to the model.
  • Parsing: Extracting text content vs. tool calls from the raw response.
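The abstraction above can be sketched as follows. This is an illustrative Python sketch, not the actual backend code (which is Rust); the names ModelProvider and EchoProvider are ours, and the response tuple shape is an assumption.

```python
from abc import ABC, abstractmethod


class ModelProvider(ABC):
    """Sketch of the provider abstraction: generation plus response parsing."""

    @abstractmethod
    def generate(self, history, tools):
        """Send prompt + history + tools to the model; return the raw response."""

    @abstractmethod
    def parse(self, raw):
        """Split a raw response into ("text", content) or ("tool_call", call)."""


class EchoProvider(ModelProvider):
    """Trivial stand-in implementation, useful for testing the loop without a model."""

    def generate(self, history, tools):
        return history[-1]["content"]

    def parse(self, raw):
        return ("text", raw)
```

A real implementation (OllamaProvider, AnthropicProvider) would perform the HTTP request in generate and inspect the provider-specific response shape in parse.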

The system supports multiple LLM providers:

  • Ollama: Local models running via Ollama server
  • Anthropic: Claude models via Anthropic API (Story 12)

Provider selection is automatic based on model name:

  • Model starts with claude- → Anthropic provider
  • Otherwise → Ollama provider
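The selection rule is a simple prefix check. A minimal sketch (the function name and return values are illustrative):

```python
def detect_provider(model: str) -> str:
    """Route claude-* models to Anthropic; everything else to the local Ollama server."""
    return "anthropic" if model.startswith("claude-") else "ollama"
```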

2. Ollama Implementation

  • Endpoint: http://localhost:11434/api/chat
  • JSON Protocol:
    • Request: { model: "name", messages: [...], stream: false, tools: [...] }
    • Response: Standard Ollama JSON with message.tool_calls.
  • Fallback: If the local model doesn't support native tool calling, we may need a fallback system-prompt approach; for this story, we assume a tool-capable model (e.g. llama3.1 or mistral-nemo).
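The request/response shapes above can be captured in two small helpers. A Python sketch (function names are ours; only the JSON shapes come from the spec):

```python
def build_ollama_request(model, messages, tools):
    """Assemble the non-streaming /api/chat request body described above."""
    return {"model": model, "messages": messages, "stream": False, "tools": tools}


def parse_ollama_response(body):
    """Return ("tool_calls", [...]) when the model requested tools, else ("text", content)."""
    msg = body["message"]
    if msg.get("tool_calls"):
        return ("tool_calls", msg["tool_calls"])
    return ("text", msg.get("content", ""))
```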

3. Anthropic (Claude) Implementation

Endpoint

  • Base URL: https://api.anthropic.com/v1/messages
  • Authentication: Requires x-api-key header with Anthropic API key
  • API Version: anthropic-version: 2023-06-01 header required

API Protocol

  • Request Format:
    {
      "model": "claude-3-5-sonnet-20241022",
      "max_tokens": 4096,
      "messages": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi!"}
      ],
      "tools": [...],
      "stream": true
    }
    
  • Response Format (Streaming):
    • Server-Sent Events (SSE)
    • Event types: message_start, content_block_start, content_block_delta, content_block_stop, message_stop
    • Tool calls appear as content_block with type: "tool_use"
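Consuming the stream amounts to folding each event into an accumulator. A minimal sketch of that reducer, covering only the two event payloads called out above (real streams also carry input_json_delta events with tool arguments, which this sketch ignores):

```python
def apply_sse_event(state, event_type, data):
    """Fold one Anthropic SSE event into accumulated text / tool-call state."""
    if event_type == "content_block_start" and data["content_block"]["type"] == "tool_use":
        # A tool call opens as its own content block carrying the tool name.
        state["tool_calls"].append(data["content_block"]["name"])
    elif event_type == "content_block_delta" and data["delta"].get("type") == "text_delta":
        # Text arrives incrementally as text_delta fragments.
        state["text"] += data["delta"]["text"]
    return state
```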

Tool Format Differences

Anthropic's tool format differs from Ollama/OpenAI:

Anthropic Tool Definition:

{
  "name": "read_file",
  "description": "Reads a file",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {"type": "string"}
    },
    "required": ["path"]
  }
}

Our Internal Format:

{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Reads a file",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      },
      "required": ["path"]
    }
  }
}

The backend must convert between these formats.
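The conversion is a mechanical re-nesting of the same fields. A Python sketch of both directions (the actual backend does this in Rust):

```python
def to_anthropic_tool(tool):
    """Internal (OpenAI-style) tool definition -> Anthropic format."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn["description"],
        "input_schema": fn["parameters"],  # same JSON Schema, different key
    }


def from_anthropic_tool(tool):
    """Anthropic format -> internal format (inverse conversion)."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["input_schema"],
        },
    }
```

Because the schema object is carried over unchanged, the two conversions round-trip exactly.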

Context Windows

  • claude-3-5-sonnet-20241022: 200,000 tokens
  • claude-3-5-haiku-20241022: 200,000 tokens

API Key Storage

  • Storage: OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
  • Crate: keyring for cross-platform support
  • Service Name: living-spec-anthropic-api-key
  • Username: default
  • Retrieval: On first use of a Claude model, check the keychain; if the key is not found, prompt the user.

4. Chat Loop (Backend)

The chat command acts as the Agent Loop:

  1. Frontend sends: User Message.
  2. Backend appends to SessionState.history.
  3. Backend calls the selected provider (Ollama or Anthropic, chosen from the model name).
  4. If Text Response: Return text to Frontend.
  5. If Tool Call:
    • Backend executes the Tool (using the Core Tools from Story #2).
    • Backend appends ToolResult to history.
    • Backend re-prompts the model with the updated history (recursion).
    • Repeat until Text Response or Max Turns reached.
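The steps above can be sketched as a small loop. This Python sketch abstracts the provider as a callable returning parsed responses; names and the history shape are illustrative:

```python
def run_agent_loop(provider, history, execute_tool, max_turns=8):
    """Call the model, execute any tool calls, append results, re-prompt
    until a text response arrives or max_turns is exhausted."""
    for _ in range(max_turns):
        kind, payload = provider(history)
        if kind == "text":
            return payload  # step 4: plain text goes back to the frontend
        # step 5: execute the tool, record the result, and loop (re-prompt)
        result = execute_tool(payload)
        history.append({"role": "tool", "content": result})
    raise RuntimeError("max turns reached without a text response")
```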

5. Model Selection UI

Unified Dropdown

The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:

<select>
  <optgroup label="Anthropic">
    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
  </optgroup>
  <optgroup label="Ollama">
    <option value="deepseek-r1:70b">deepseek-r1:70b</option>
    <option value="llama3.1">llama3.1</option>
    <option value="qwen2.5">qwen2.5</option>
  </optgroup>
</select>

Model List Sources

  • Ollama: Fetched from http://localhost:11434/api/tags via get_ollama_models command
  • Anthropic: Hardcoded list of supported Claude models (no API to fetch available models)

API Key Flow

  1. User selects a Claude model from dropdown
  2. Frontend sends chat request to backend
  3. Backend detects claude- prefix in model name
  4. Backend checks OS keychain for stored API key
  5. If not found:
    • Backend returns error: "Anthropic API key not found"
    • Frontend shows dialog prompting for API key
    • User enters key
    • Frontend calls set_anthropic_api_key command
    • Backend stores key in OS keychain
    • User retries chat request
  6. If found: Backend proceeds with Anthropic API request
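Steps 3-6 on the backend reduce to a single guarded lookup. A sketch, with a plain dict standing in for the OS keychain (the real backend uses the Rust keyring crate; the function name and error type here are illustrative):

```python
def ensure_api_key(model, keychain):
    """Return the stored API key for Claude models, None for Ollama models;
    raise if a Claude model is selected but no key has been stored yet."""
    if not model.startswith("claude-"):
        return None  # Ollama models need no key
    key = keychain.get("living-spec-anthropic-api-key")
    if key is None:
        # Surfaced to the frontend, which then shows the API-key dialog.
        raise LookupError("Anthropic API key not found")
    return key
```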

6. Frontend State

  • Settings: Store selected_model (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
  • Provider Detection: Auto-detected from model name (frontend doesn't need to track provider separately)
  • Chat: Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).