Functional Spec: AI Integration
1. Provider Abstraction
The system uses a pluggable architecture for LLMs. The `ModelProvider` interface abstracts:
- Generation: Sending prompt + history + tools to the model.
- Parsing: Extracting text content vs. tool calls from the raw response.
The system supports multiple LLM providers:
- Ollama: Local models running via Ollama server
- Anthropic: Claude models via Anthropic API (Story 12)
Provider selection is automatic based on model name:
- Model starts with `claude-` → Anthropic provider
- Otherwise → Ollama provider
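The routing rule above can be sketched as a single prefix check; the function name and the provider labels here are illustrative, not the backend's real identifiers:

```python
def select_provider(model: str) -> str:
    """Route a model name to its provider by prefix (sketch of the rule above)."""
    if model.startswith("claude-"):
        return "anthropic"
    return "ollama"
```

Keeping this check in one place means the frontend never needs to track the provider separately (see Frontend State below).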
2. Ollama Implementation
- Endpoint: `http://localhost:11434/api/chat`
- JSON Protocol:
  - Request: `{ model: "name", messages: [...], stream: false, tools: [...] }`
  - Response: Standard Ollama JSON with `message.tool_calls`.
- Fallback: If the specific local model doesn't support native tool calling, we may need a fallback system-prompt approach, but for this story we assume a tool-capable model (like `llama3.1` or `mistral-nemo`).
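The "Parsing" responsibility from Section 1 reduces, for Ollama, to checking whether `message.tool_calls` is present in the response JSON. A minimal sketch, assuming the response shape described above (the function name is illustrative):

```python
def parse_ollama_response(resp: dict):
    """Classify an /api/chat response: ("tool_calls", [...]) if the model
    requested tools, otherwise ("text", content)."""
    msg = resp.get("message", {})
    calls = msg.get("tool_calls")
    if calls:
        return ("tool_calls", calls)
    return ("text", msg.get("content", ""))
```

The agent loop in Section 4 branches on exactly this distinction.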
3. Anthropic (Claude) Implementation
Endpoint
- Base URL: `https://api.anthropic.com/v1/messages`
- Authentication: Requires `x-api-key` header with Anthropic API key
- API Version: `anthropic-version: 2023-06-01` header required
API Protocol
- Request Format:
  {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 4096,
    "messages": [
      {"role": "user", "content": "Hello"},
      {"role": "assistant", "content": "Hi!"}
    ],
    "tools": [...],
    "stream": true
  }
- Response Format (Streaming):
  - Server-Sent Events (SSE)
  - Event types: `message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_stop`
  - Tool calls appear as a `content_block` with `type: "tool_use"`
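To reassemble streamed text, the backend concatenates the delta events between `content_block_start` and `content_block_stop`. A sketch over already-decoded event payloads, assuming Anthropic's `content_block_delta`/`text_delta` shape (treat field names beyond the event types listed above as assumptions):

```python
def accumulate_text(events: list) -> str:
    """Concatenate text fragments from content_block_delta events (sketch)."""
    parts = []
    for ev in events:
        if ev.get("type") == "content_block_delta":
            delta = ev.get("delta", {})
            if delta.get("type") == "text_delta":
                parts.append(delta.get("text", ""))
    return "".join(parts)
```

A `tool_use` content block would be handled separately by inspecting `content_block_start` events instead.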
Tool Format Differences
Anthropic's tool format differs from Ollama/OpenAI:
Anthropic Tool Definition:
{
"name": "read_file",
"description": "Reads a file",
"input_schema": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
}
}
Our Internal Format:
{
"type": "function",
"function": {
"name": "read_file",
"description": "Reads a file",
"parameters": {
"type": "object",
"properties": {
"path": {"type": "string"}
},
"required": ["path"]
}
}
}
The backend must convert between these formats.
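The conversion is mechanical: flatten the nested `function` object and rename `parameters` to `input_schema`. A sketch (the function name is illustrative; the real backend does this in Rust):

```python
def to_anthropic_tool(tool: dict) -> dict:
    """Convert an internal (OpenAI-style) tool definition to Anthropic's format:
    drop the "type"/"function" wrapper and rename "parameters" to "input_schema"."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn["description"],
        "input_schema": fn["parameters"],
    }
```

The reverse mapping (for tool definitions coming back out, if ever needed) just re-wraps the same three fields.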
Context Windows
- claude-3-5-sonnet-20241022: 200,000 tokens
- claude-3-5-haiku-20241022: 200,000 tokens
API Key Storage
- Storage: OS keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service)
- Crate: `keyring` for cross-platform support
- Service Name: `living-spec-anthropic-api-key`
- Username: `default`
- Retrieval: On first use of a Claude model, check the keychain. If not found, prompt the user.
4. Chat Loop (Backend)
The chat command acts as the Agent Loop:
- Frontend sends: User Message.
- Backend appends to `SessionState.history`.
- Backend calls the selected provider (e.g., `OllamaProvider`).
- If Text Response: Return text to Frontend.
- If Tool Call:
  - Backend executes the Tool (using the Core Tools from Story #2).
  - Backend appends `ToolResult` to history.
  - Backend re-prompts the model with the new history (recursion).
- Repeat until Text Response or Max Turns reached.
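The loop above can be sketched with the provider and tool executor injected as callables. Everything here is an illustrative stand-in for the backend's real types, including the turn cap (the spec only says "Max Turns"):

```python
MAX_TURNS = 8  # illustrative cap, not a value from the spec

def chat_loop(history: list, call_model, execute_tool) -> str:
    """Agent-loop sketch. call_model(history) returns ("text", str) or
    ("tool_call", call); execute_tool(call) returns a tool result."""
    for _ in range(MAX_TURNS):
        kind, payload = call_model(history)
        if kind == "text":
            return payload  # final answer goes back to the frontend
        # Tool call: run it, append the ToolResult, and re-prompt the model.
        history.append({"role": "tool", "content": execute_tool(payload)})
    raise RuntimeError("Max turns reached without a text response")
```

Note the loop mutates `history` in place, so the tool results persist into `SessionState.history` for later turns.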
5. Model Selection UI
Unified Dropdown
The model selection dropdown combines both Ollama and Anthropic models in a single list, organized by provider:
<select>
<optgroup label="Anthropic">
<option value="claude-3-5-sonnet-20241022">claude-3-5-sonnet-20241022</option>
<option value="claude-3-5-haiku-20241022">claude-3-5-haiku-20241022</option>
</optgroup>
<optgroup label="Ollama">
<option value="deepseek-r1:70b">deepseek-r1:70b</option>
<option value="llama3.1">llama3.1</option>
<option value="qwen2.5">qwen2.5</option>
</optgroup>
</select>
Model List Sources
- Ollama: Fetched from `http://localhost:11434/api/tags` via the `get_ollama_models` command
- Anthropic: Hardcoded list of supported Claude models (no API to fetch available models)
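Merging the two sources into the grouped structure the dropdown renders can be sketched as below; the `/api/tags` response shape (`{"models": [{"name": ...}, ...]}`) is an assumption about Ollama's JSON, and the function name is illustrative:

```python
# Hardcoded Claude list, mirroring the dropdown above.
CLAUDE_MODELS = ["claude-3-5-sonnet-20241022", "claude-3-5-haiku-20241022"]

def model_groups(ollama_tags_response: dict) -> dict:
    """Build provider-grouped model lists for the unified dropdown (sketch)."""
    ollama = [m["name"] for m in ollama_tags_response.get("models", [])]
    return {"Anthropic": CLAUDE_MODELS, "Ollama": ollama}
```

The frontend would then render one `<optgroup>` per key, preserving the Anthropic-first ordering shown above.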
API Key Flow
- User selects a Claude model from dropdown
- Frontend sends chat request to backend
- Backend detects `claude-` prefix in model name
- Backend checks OS keychain for stored API key
- If not found:
- Backend returns error: "Anthropic API key not found"
- Frontend shows dialog prompting for API key
- User enters key
- Frontend calls the `set_anthropic_api_key` command
- Backend stores key in OS keychain
- User retries chat request
- If found: Backend proceeds with Anthropic API request
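The backend half of this flow can be sketched with a plain dict standing in for the OS keychain (the real backend uses the Rust `keyring` crate against the service name from Section 3; the Python function names here are illustrative):

```python
SERVICE = "living-spec-anthropic-api-key"  # service name from the spec

def get_api_key(keychain: dict) -> str:
    """Return the stored key, or raise the error the frontend turns into a
    'enter your API key' dialog."""
    key = keychain.get(SERVICE)
    if key is None:
        raise LookupError("Anthropic API key not found")
    return key

def set_anthropic_api_key(keychain: dict, key: str) -> None:
    """Store the user-supplied key under the spec's service name."""
    keychain[SERVICE] = key
```

After `set_anthropic_api_key` succeeds, the retried chat request finds the key and proceeds to the Anthropic API request.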
6. Frontend State
- Settings: Store `selected_model` (e.g., "claude-3-5-sonnet-20241022" or "llama3.1")
- Provider Detection: Auto-detected from model name (frontend doesn't need to track provider separately)
- Chat: Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).