storkit/.living_spec/specs/functional/AI_INTEGRATION.md
Dave e71dcd8226 Story 12: Update story and specs for Claude integration
Story Updates:
- Unified model dropdown with section headers (Anthropic, Ollama)
- Auto-detect provider from model name (claude-* prefix)
- API key prompt on first Claude model use
- Secure storage in OS keychain via keyring crate
- 200k token context window for Claude models

Spec Updates (AI_INTEGRATION.md):
- Document Anthropic provider implementation
- Anthropic API protocol (SSE streaming, tool format)
- Tool format conversion between internal and Anthropic formats
- API key storage in OS keychain
- Unified dropdown UI flow

Spec Updates (STACK.md):
- Add keyring crate for secure API key storage
- Add eventsource-stream for Anthropic SSE streaming
- Document automatic provider detection
- Update API key management approach
2025-12-27 19:37:01 +00:00


Functional Spec: AI Integration

1. Provider Abstraction

The system uses a pluggable architecture for LLMs. The ModelProvider interface abstracts:

  • Generation: Sending prompt + history + tools to the model.
  • Parsing: Extracting text content vs. tool calls from the raw response.
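The abstraction above can be sketched as follows. This is an illustrative Python sketch, not the actual backend code (which is Rust); the names ModelProvider and EchoProvider are ours, and the response tuple shape is an assumption.

```python
from abc import ABC, abstractmethod


class ModelProvider(ABC):
    """Sketch of the provider abstraction: generation plus response parsing."""

    @abstractmethod
    def generate(self, history, tools):
        """Send prompt + history + tools to the model; return the raw response."""

    @abstractmethod
    def parse(self, raw):
        """Split a raw response into ("text", content) or ("tool_call", call)."""


class EchoProvider(ModelProvider):
    """Trivial stand-in implementation, useful for testing the loop without a model."""

    def generate(self, history, tools):
        return history[-1]["content"]

    def parse(self, raw):
        return ("text", raw)
```

A real implementation (OllamaProvider, AnthropicProvider) would perform the HTTP request in generate and inspect the provider-specific response shape in parse.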

The system supports multiple LLM providers:

  • Ollama: Local models running via Ollama server
  • Anthropic: Claude models via Anthropic API (Story 12)

Provider selection is automatic based on model name:

  • Model starts with claude- → Anthropic provider
  • Otherwise → Ollama provider
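The selection rule is a simple prefix check. A minimal sketch (the function name and return values are illustrative):

```python
def detect_provider(model: str) -> str:
    """Route claude-* models to Anthropic; everything else to the local Ollama server."""
    return "anthropic" if model.startswith("claude-") else "ollama"
```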

2. Ollama Implementation

  • Endpoint: http://localhost:11434/api/chat
  • JSON Protocol:
    • Request: { model: "name", messages: [...], stream: false, tools: [...] }
    • Response: Standard Ollama JSON with message.tool_calls.
  • Fallback: If the local model doesn't support native tool calling, we may need a fallback system-prompt approach; for this story, we assume a tool-capable model (e.g. llama3.1 or mistral-nemo).
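The request/response shapes above can be captured in two small helpers. A Python sketch (function names are ours; only the JSON shapes come from the spec):

```python
def build_ollama_request(model, messages, tools):
    """Assemble the non-streaming /api/chat request body described above."""
    return {"model": model, "messages": messages, "stream": False, "tools": tools}


def parse_ollama_response(body):
    """Return ("tool_calls", [...]) when the model requested tools, else ("text", content)."""
    msg = body["message"]
    if msg.get("tool_calls"):
        return ("tool_calls", msg["tool_calls"])
    return ("text", msg.get("content", ""))
```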

3. Anthropic (Claude) Implementation

Endpoint

  • Base URL: https://api.anthropic.com/v1/messages
  • Authentication: Requires x-api-key header with Anthropic API key
  • API Version: anthropic-version: 2023-06-01 header required

API Protocol

  • Request Format:
    {
      "model": "claude-3-5-sonnet-20241022",
      "max_tokens": 4096,
      "messages": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi!"}
      ],
      "tools": [...],
      "stream": true
    }
    
  • Response Format (Streaming):
    • Server-Sent Events (SSE)
    • Event types: message_start, content_block_start, content_block_delta, content_block_stop, message_stop
    • Tool calls appear as content_block with type: "tool_use"
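Consuming the stream amounts to folding each event into an accumulator. A minimal sketch of that reducer, covering only the two event payloads called out above (real streams also carry input_json_delta events with tool arguments, which this sketch ignores):

```python
def apply_sse_event(state, event_type, data):
    """Fold one Anthropic SSE event into accumulated text / tool-call state."""
    if event_type == "content_block_start" and data["content_block"]["type"] == "tool_use":
        # A tool call opens as its own content block carrying the tool name.
        state["tool_calls"].append(data["content_block"]["name"])
    elif event_type == "content_block_delta" and data["delta"].get("type") == "text_delta":
        # Text arrives incrementally as text_delta fragments.
        state["text"] += data["delta"]["text"]
    return state
```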

Tool Format Differences

Anthropic's tool format differs from Ollama/OpenAI:

Anthropic Tool Definition:

{
  "name": "read_file",
  "description": "Reads a file",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {"type": "string"}
    },
    "required": ["path"]
  }
}

Our Internal Format:

{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Reads a file",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      },
      "required": ["path"]
    }
  }
}

The backend must convert between these formats.
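The conversion is a mechanical re-nesting of the same fields. A Python sketch of both directions (the actual backend does this in Rust):

```python
def to_anthropic_tool(tool):
    """Internal (OpenAI-style) tool definition -> Anthropic format."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn["description"],
        "input_schema": fn["parameters"],  # same JSON Schema, different key
    }


def from_anthropic_tool(tool):
    """Anthropic format -> internal format (inverse conversion)."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["input_schema"],
        },
    }
```

Because the schema object is carried over unchanged, the two conversions round-trip exactly.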

Context Windows

  • claude-3-5-sonnet-20241022: 200,000 tokens
  • claude-3-5-haiku-20241022: 200,000 tokens

API Key Storage

  • Storage: OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
  • Crate: keyring for cross-platform support
  • Service Name: living-spec-anthropic-api-key
  • Username: default
  • Retrieval: On first use of a Claude model, check the keychain; if the key is not found, prompt the user.

4. Chat Loop (Backend)

The chat command acts as the Agent Loop:

  1. Frontend sends: User Message.
  2. Backend appends to SessionState.history.
  3. Backend calls the selected provider (Ollama or Anthropic, chosen from the model name).
  4. If Text Response: Return text to Frontend.
  5. If Tool Call:
    • Backend executes the Tool (using the Core Tools from Story #2).
    • Backend appends ToolResult to history.
    • Backend re-prompts the model with the updated history (recursion).
    • Repeat until Text Response or Max Turns reached.
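The steps above can be sketched as a small loop. This Python sketch abstracts the provider as a callable returning parsed responses; names and the history shape are illustrative:

```python
def run_agent_loop(provider, history, execute_tool, max_turns=8):
    """Call the model, execute any tool calls, append results, re-prompt
    until a text response arrives or max_turns is exhausted."""
    for _ in range(max_turns):
        kind, payload = provider(history)
        if kind == "text":
            return payload  # step 4: plain text goes back to the frontend
        # step 5: execute the tool, record the result, and loop (re-prompt)
        result = execute_tool(payload)
        history.append({"role": "tool", "content": result})
    raise RuntimeError("max turns reached without a text response")
```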

5. Model Selection UI

Unified Dropdown

The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:

<select>
  <optgroup label="Anthropic">
    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
  </optgroup>
  <optgroup label="Ollama">
    <option value="deepseek-r1:70b">deepseek-r1:70b</option>
    <option value="llama3.1">llama3.1</option>
    <option value="qwen2.5">qwen2.5</option>
  </optgroup>
</select>

Model List Sources

  • Ollama: Fetched from http://localhost:11434/api/tags via get_ollama_models command
  • Anthropic: Hardcoded list of supported Claude models (no API to fetch available models)

API Key Flow

  1. User selects a Claude model from dropdown
  2. Frontend sends chat request to backend
  3. Backend detects claude- prefix in model name
  4. Backend checks OS keychain for stored API key
  5. If not found:
    • Backend returns error: "Anthropic API key not found"
    • Frontend shows dialog prompting for API key
    • User enters key
    • Frontend calls set_anthropic_api_key command
    • Backend stores key in OS keychain
    • User retries chat request
  6. If found: Backend proceeds with Anthropic API request
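Steps 3-6 on the backend reduce to a single guarded lookup. A sketch, with a plain dict standing in for the OS keychain (the real backend uses the Rust keyring crate; the function name and error type here are illustrative):

```python
def ensure_api_key(model, keychain):
    """Return the stored API key for Claude models, None for Ollama models;
    raise if a Claude model is selected but no key has been stored yet."""
    if not model.startswith("claude-"):
        return None  # Ollama models need no key
    key = keychain.get("living-spec-anthropic-api-key")
    if key is None:
        # Surfaced to the frontend, which then shows the API-key dialog.
        raise LookupError("Anthropic API key not found")
    return key
```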

6. Frontend State

  • Settings: Store selected_model (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
  • Provider Detection: Auto-detected from model name (frontend doesn't need to track provider separately)
  • Chat: Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).