Functional Spec: AI Integration
1. Provider Abstraction
The system uses a pluggable architecture for LLMs. The ModelProvider interface abstracts:
- Generation: Sending prompt + history + tools to the model.
- Parsing: Extracting text content vs. tool calls from the raw response.
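The two responsibilities above can be sketched as a small interface. This is a hypothetical sketch (the backend language and the names `ModelProvider.generate`, `ModelResponse`, `ToolCall` are assumptions, not final API), shown in Python:

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class ToolCall:
    """One tool invocation parsed from the model's raw response."""
    name: str
    arguments: dict


@dataclass
class ModelResponse:
    """Parsed result: plain text, tool calls, or both."""
    text: str = ""
    tool_calls: list[ToolCall] = field(default_factory=list)


class ModelProvider(Protocol):
    def generate(self, messages: list[dict], tools: list[dict]) -> ModelResponse:
        """Send prompt + history + tool definitions; parse the raw reply."""
        ...
```

Any concrete provider (e.g., the Ollama implementation below) would satisfy this protocol, keeping the chat loop provider-agnostic.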
2. Ollama Implementation
- Endpoint: http://localhost:11434/api/chat
- JSON Protocol:
  - Request: { model: "name", messages: [...], stream: false, tools: [...] }
  - Response: Standard Ollama JSON with message.tool_calls.
- Fallback: If the specific local model doesn't support native tool calling, we may need a fallback system-prompt approach, but for this story we assume a tool-capable model (like llama3.1 or mistral-nemo).
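A minimal sketch of building the request body (the field names match the protocol above; the helper `build_chat_request` itself is an assumption, not the final code):

```python
import json


def build_chat_request(model: str, messages: list[dict], tools: list[dict]) -> str:
    """Serialize an Ollama /api/chat request body."""
    payload = {
        "model": model,
        "messages": messages,  # full conversation history: {role, content} dicts
        "stream": False,       # one complete JSON reply, not a token stream
        "tools": tools,        # JSON-schema tool definitions
    }
    return json.dumps(payload)
```

With stream set to false, the response arrives as a single JSON object whose message.tool_calls field (if present) carries the parsed tool invocations.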
3. Chat Loop (Backend)
The chat command acts as the Agent Loop:
- Frontend sends: User Message.
- Backend appends to SessionState.history.
- Backend calls OllamaProvider.
- If Text Response: Return text to Frontend.
- If Tool Call:
  - Backend executes the Tool (using the Core Tools from Story #2).
  - Backend appends ToolResult to history.
  - Backend re-prompts Ollama with the new history (recursion).
- Repeat until Text Response or Max Turns reached.
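The loop above can be sketched as follows. This is an offline illustration with hedged names (`run_agent_loop`, `execute_tool`, and the response dict shape are assumptions); the provider is injected so the example runs without Ollama:

```python
MAX_TURNS = 5  # guard against infinite tool-call recursion


def run_agent_loop(provider, history: list[dict], execute_tool) -> str:
    """Drive the chat loop: call the model, run tools, re-prompt, repeat."""
    for _ in range(MAX_TURNS):
        response = provider.generate(history)
        if not response.get("tool_calls"):
            # Text response: return it to the frontend and stop.
            return response["content"]
        for call in response["tool_calls"]:
            # Execute the tool and append the ToolResult to history so the
            # model sees it on the next turn.
            result = execute_tool(call["name"], call["arguments"])
            history.append({"role": "tool", "content": result})
    return "(max turns reached)"
```

Note the loop mutates history in place, which is what lets the re-prompt carry the accumulated ToolResults.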
4. Frontend State
- Settings: Store llm_provider ("ollama"), ollama_model ("llama3.2"), ollama_base_url.
- Chat: Display the conversation. Tool calls should be visible as "System Events" (e.g., collapsed accordions).
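The settings shape can be summarized as a small record. Shown in Python for consistency with the sketches above (the frontend's actual language and the `LLMSettings` name are assumptions); the defaults are the values named in the spec, with the base URL defaulting to the endpoint from section 2:

```python
from dataclasses import dataclass


@dataclass
class LLMSettings:
    """Persisted LLM settings; defaults taken from this spec."""
    llm_provider: str = "ollama"
    ollama_model: str = "llama3.2"
    ollama_base_url: str = "http://localhost:11434"
```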