2025-12-24 17:17:35 +00:00
# Functional Spec: AI Integration
## 1. Provider Abstraction
The system uses a pluggable architecture for LLMs. The `ModelProvider` interface abstracts:
* **Generation:** Sending prompt + history + tools to the model.
* **Parsing:** Extracting text content vs. tool calls from the raw response.
The system supports multiple LLM providers:
* **Ollama:** Local models running via the Ollama server
* **Anthropic:** Claude models via the Anthropic API (Story 12)
Provider selection is **automatic** based on the model name:
* Model starts with `claude-` → Anthropic provider
* Otherwise → Ollama provider
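The routing rule above can be sketched as a single prefix check. This is a minimal illustration, not the actual backend code; the `Provider` enum and `detect_provider` function names are hypothetical.

```rust
// Hypothetical sketch of the prefix-based provider routing described above.
#[derive(Debug, PartialEq)]
enum Provider {
    Anthropic,
    Ollama,
}

fn detect_provider(model: &str) -> Provider {
    // Model names beginning with "claude-" route to Anthropic;
    // everything else falls through to the local Ollama server.
    if model.starts_with("claude-") {
        Provider::Anthropic
    } else {
        Provider::Ollama
    }
}

fn main() {
    assert_eq!(detect_provider("claude-3-5-sonnet-20241022"), Provider::Anthropic);
    assert_eq!(detect_provider("llama3.1"), Provider::Ollama);
}
```

Because detection is purely name-based, adding a third provider later only means adding another prefix branch here.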
## 2. Ollama Implementation
* **Endpoint:** `http://localhost:11434/api/chat`
* **JSON Protocol:**
  * Request: `{ model: "name", messages: [...], stream: false, tools: [...] }`
  * Response: Standard Ollama JSON with `message.tool_calls`.
* **Fallback:** If the selected local model doesn't support native tool calling, we may need a fallback system-prompt approach; for this story, we assume a tool-capable model (such as `llama3.1` or `mistral-nemo`).
## 3. Anthropic (Claude) Implementation
### Endpoint
* **Base URL:** `https://api.anthropic.com/v1/messages`
* **Authentication:** Requires an `x-api-key` header with the Anthropic API key
* **API Version:** The `anthropic-version: 2023-06-01` header is required
### API Protocol
* **Request Format:**
```json
{
  "model": "claude-3-5-sonnet-20241022",
  "max_tokens": 4096,
  "messages": [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi!"}
  ],
  "tools": [...],
  "stream": true
}
```
* **Response Format (Streaming):**
  * Server-Sent Events (SSE)
  * Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_stop`
  * Tool calls appear as a `content_block` with `type: "tool_use"`
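Splitting the stream into named events is the first step of handling these responses. The sketch below is a simplified illustration that assumes the raw SSE text is already buffered (real responses arrive incrementally over HTTP, and `data:` payloads can span multiple lines); `parse_sse` is a hypothetical helper, not the backend's actual parser.

```rust
// Minimal sketch: split an Anthropic-style SSE stream into (event, data) pairs.
// SSE events are blank-line-separated blocks of "field: value" lines.
fn parse_sse(raw: &str) -> Vec<(String, String)> {
    let mut events = Vec::new();
    for block in raw.split("\n\n") {
        let mut event = String::new();
        let mut data = String::new();
        for line in block.lines() {
            if let Some(v) = line.strip_prefix("event: ") {
                event = v.to_string();
            } else if let Some(v) = line.strip_prefix("data: ") {
                data = v.to_string();
            }
        }
        if !event.is_empty() {
            events.push((event, data));
        }
    }
    events
}

fn main() {
    let raw = "event: message_start\ndata: {}\n\nevent: message_stop\ndata: {}\n";
    let events = parse_sse(raw);
    assert_eq!(events[0].0, "message_start");
    assert_eq!(events.last().unwrap().0, "message_stop");
}
```

The backend would then dispatch on the event name, accumulating `content_block_delta` payloads into text or `tool_use` blocks until `message_stop`.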
### Tool Format Differences
Anthropic's tool format differs from Ollama/OpenAI:
**Anthropic Tool Definition:**
```json
{
  "name": "read_file",
  "description": "Reads a file",
  "input_schema": {
    "type": "object",
    "properties": {
      "path": {"type": "string"}
    },
    "required": ["path"]
  }
}
```
**Our Internal Format:**
```json
{
  "type": "function",
  "function": {
    "name": "read_file",
    "description": "Reads a file",
    "parameters": {
      "type": "object",
      "properties": {
        "path": {"type": "string"}
      },
      "required": ["path"]
    }
  }
}
```
The backend must convert between these formats.
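The conversion is mechanical: unwrap the `{"type": "function", "function": {...}}` envelope and rename `parameters` to `input_schema`. A simplified sketch, with illustrative struct names and the JSON schema kept as an opaque string rather than a parsed value:

```rust
// Illustrative internal and Anthropic tool-definition shapes (simplified).
struct InternalTool {
    name: String,
    description: String,
    parameters: String, // JSON schema, e.g. r#"{"type":"object",...}"#
}

struct AnthropicTool {
    name: String,
    description: String,
    input_schema: String, // the same schema under a different key
}

fn to_anthropic(tool: InternalTool) -> AnthropicTool {
    // Drop the {"type":"function","function":{...}} wrapper and
    // carry "parameters" over as "input_schema" unchanged.
    AnthropicTool {
        name: tool.name,
        description: tool.description,
        input_schema: tool.parameters,
    }
}

fn main() {
    let t = to_anthropic(InternalTool {
        name: "read_file".into(),
        description: "Reads a file".into(),
        parameters: r#"{"type":"object"}"#.into(),
    });
    assert_eq!(t.name, "read_file");
    assert_eq!(t.input_schema, r#"{"type":"object"}"#);
}
```

The reverse mapping (for tool calls coming back from Claude as `tool_use` blocks) re-wraps the same fields in the internal envelope.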
### Context Windows
* **claude-3-5-sonnet-20241022:** 200,000 tokens
* **claude-3-5-haiku-20241022:** 200,000 tokens
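Since the supported Claude models are a hardcoded list, their context windows can live in a static table. A minimal sketch (the table and `context_window` helper are illustrative names, not the actual backend API):

```rust
// Hypothetical static lookup for the context windows listed above.
const CONTEXT_WINDOWS: &[(&str, u32)] = &[
    ("claude-3-5-sonnet-20241022", 200_000),
    ("claude-3-5-haiku-20241022", 200_000),
];

fn context_window(model: &str) -> Option<u32> {
    CONTEXT_WINDOWS
        .iter()
        .find(|(name, _)| *name == model)
        .map(|(_, tokens)| *tokens)
}

fn main() {
    assert_eq!(context_window("claude-3-5-sonnet-20241022"), Some(200_000));
    assert_eq!(context_window("llama3.1"), None); // Ollama models aren't in this table
}
```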
### API Key Storage
* **Storage:** OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
* **Crate:** `keyring` for cross-platform support
* **Service Name:** `living-spec-anthropic-api-key`
* **Username:** `default`
* **Retrieval:** On first use of a Claude model, check the keychain. If the key is not found, prompt the user.
## 4. Chat Loop (Backend)
The `chat` command acts as the **Agent Loop**:
1. Frontend sends: `User Message`.
2. Backend appends to `SessionState.history`.
3. Backend calls `OllamaProvider`.
4. **If Text Response:** Return text to Frontend.
5. **If Tool Call:**
   * Backend executes the Tool (using the Core Tools from Story #2).
   * Backend appends `ToolResult` to history.
   * Backend *re-prompts* Ollama with the new history (recursion).
   * Repeat until Text Response or Max Turns reached.
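The steps above can be sketched as a bounded loop over a provider trait. This is a schematic illustration with a mock provider; `ModelProvider`, `Reply`, `run_agent_loop`, and `MAX_TURNS` are hypothetical names, and tool arguments/results are reduced to strings for brevity.

```rust
// Schematic agent loop: re-prompt after each tool call, up to a turn limit.
enum Reply {
    Text(String),
    ToolCall(String), // tool name; arguments omitted for brevity
}

trait ModelProvider {
    fn generate(&mut self, history: &[String]) -> Reply;
}

const MAX_TURNS: usize = 10; // illustrative cap

fn run_agent_loop(provider: &mut dyn ModelProvider, history: &mut Vec<String>) -> String {
    for _ in 0..MAX_TURNS {
        match provider.generate(history) {
            // Text response: done, return it to the frontend.
            Reply::Text(text) => return text,
            // Tool call: execute it, append the result, and re-prompt.
            Reply::ToolCall(name) => {
                history.push(format!("tool_result:{name}"));
            }
        }
    }
    "max turns reached".to_string()
}

// Mock provider: one tool call, then a final text answer.
struct Mock { calls: usize }
impl ModelProvider for Mock {
    fn generate(&mut self, _history: &[String]) -> Reply {
        self.calls += 1;
        if self.calls == 1 {
            Reply::ToolCall("read_file".into())
        } else {
            Reply::Text("done".into())
        }
    }
}

fn main() {
    let mut history = vec!["user:Hello".to_string()];
    let answer = run_agent_loop(&mut Mock { calls: 0 }, &mut history);
    assert_eq!(answer, "done");
    assert!(history.iter().any(|m| m.starts_with("tool_result:")));
}
```

The turn cap guards against a model that keeps emitting tool calls without ever producing a final text answer.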
## 5. Model Selection UI
### Unified Dropdown
The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:
```html
<select>
  <optgroup label="Anthropic">
    <option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
    <option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
  </optgroup>
  <optgroup label="Ollama">
    <option value="deepseek-r1:70b">deepseek-r1:70b</option>
    <option value="llama3.1">llama3.1</option>
    <option value="qwen2.5">qwen2.5</option>
  </optgroup>
</select>
```
### Model List Sources
* **Ollama:** Fetched from `http://localhost:11434/api/tags` via the `get_ollama_models` command
* **Anthropic:** Hardcoded list of supported Claude models (no API to fetch available models)
### API Key Flow
1. User selects a Claude model from dropdown
2. Frontend sends chat request to backend
3. Backend detects `claude-` prefix in model name
4. Backend checks OS keychain for stored API key
5. If not found:
- Backend returns error: "Anthropic API key not found"
- Frontend shows dialog prompting for API key
- User enters key
- Frontend calls `set_anthropic_api_key` command
- Backend stores key in OS keychain
- User retries chat request
6. If found: Backend proceeds with Anthropic API request
## 6. Frontend State
* **Settings:** Store `selected_model` (e.g., `"claude-3-5-sonnet-20241022"` or `"llama3.1"`)
* **Provider Detection:** Auto-detected from the model name (the frontend doesn't need to track the provider separately)
* **Chat:** Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).