Refocus workflow on TDD and reorganize stories

This commit is contained in:
Dave
2026-02-17 13:34:32 +00:00
parent 1f4f10930f
commit 4c887d93b5
42 changed files with 155 additions and 498 deletions

@@ -0,0 +1,18 @@
# Story: Project Selection & Read Verification
## User Story
**As a** User
**I want to** select a local folder on my computer as the "Target Project"
**So that** the assistant knows which codebase to analyze and work on.
## Acceptance Criteria
* [ ] UI has an "Open Project" button.
* [ ] Clicking the button opens the native OS folder picker.
* [ ] Upon selection, the UI displays the selected path.
* [ ] The system verifies the folder exists and is readable.
* [ ] The application state persists the "Current Project" (in memory is fine for now).
## Out of Scope
* Persisting the selection across app restarts (save that for later).
* Scanning the file tree (just verify the root exists).
* Git validation (we'll assume any folder is valid for now).

@@ -0,0 +1,20 @@
# Story: Core Agent Tools (The Hands)
## User Story
**As an** Agent
**I want to** be able to read files, list directories, search content, and execute shell commands
**So that** I can autonomously explore and modify the target project.
## Acceptance Criteria
* [ ] Rust Backend: Implement `read_file(path)` command (scoped to project).
* [ ] Rust Backend: Implement `write_file(path, content)` command (scoped to project).
* [ ] Rust Backend: Implement `list_directory(path)` command.
* [ ] Rust Backend: Implement `exec_shell(command, args)` command.
* [ ] Must enforce an allowlist (git, cargo, npm, etc.).
* [ ] Must run in project root.
* [ ] Rust Backend: Implement `search_files(query, globs)` using `ignore` crate.
* [ ] Frontend: Expose these as tools to the (future) LLM interface.
## Out of Scope
* The LLM Chat UI itself (connecting these to a visual chat window comes later).
* Complex git merges (simple commands only).
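The allowlist rule above can be sketched as a small predicate. The real enforcement lives in the Rust `exec_shell` command; this TypeScript sketch (with an illustrative, hypothetical allowlist) just shows the intended check — only the bare program name is compared, since arguments are passed separately.

```typescript
// Hypothetical allowlist; the real list lives in the Rust backend.
const ALLOWED_COMMANDS = ["git", "cargo", "npm"];

function isCommandAllowed(command: string, allowlist: string[] = ALLOWED_COMMANDS): boolean {
  // Compare the bare program name only; args are a separate parameter,
  // so "git" passes while "rm" (or anything unlisted) is rejected.
  return allowlist.includes(command.trim());
}
```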

@@ -0,0 +1,22 @@
# Story: The Agent Brain (Ollama Integration)
## User Story
**As a** User
**I want to** connect the Assistant to a local Ollama instance
**So that** I can chat with the Agent and have it execute tools without sending data to the cloud.
## Acceptance Criteria
* [ ] Backend: Implement `ModelProvider` trait/interface.
* [ ] Backend: Implement `OllamaProvider` (POST /api/chat).
* [ ] Backend: Implement `chat(message, history, provider_config)` command.
* [ ] Must support passing Tool Definitions to Ollama (if model supports it) or System Prompt instructions.
* [ ] Must parse Tool Calls from the response.
* [ ] Frontend: Settings Screen to toggle "Ollama" and set Model Name (default: `llama3`).
* [ ] Frontend: Chat Interface.
* [ ] Message History (User/Assistant).
* [ ] Tool Call visualization (e.g., "Running git status...").
## Out of Scope
* Remote Providers (Anthropic/OpenAI) - Future Story.
* Streaming responses (wait for full completion for MVP).
* Complex context window management (just send full history for now).
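The "parse Tool Calls from the response" criterion can be sketched as below. The shapes follow Ollama's documented `/api/chat` response format, but treat the exact field names as an assumption; the real parsing happens in the Rust backend.

```typescript
// Simplified shape of an Ollama /api/chat response (non-streaming).
interface OllamaToolCall {
  function: { name: string; arguments: Record<string, unknown> };
}
interface OllamaChatResponse {
  message: { role: string; content: string; tool_calls?: OllamaToolCall[] };
}

// Extract tool calls (if any) so the agent loop can execute them.
function parseToolCalls(resp: OllamaChatResponse): OllamaToolCall[] {
  return resp.message.tool_calls ?? [];
}
```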

@@ -0,0 +1,17 @@
# Story: Ollama Model Detection
## User Story
**As a** User
**I want to** select my Ollama model from a dropdown list of installed models
**So that** I don't have to manually type (and potentially mistype) the model names.
## Acceptance Criteria
* [ ] Backend: Implement `get_ollama_models()` command.
* [ ] Call `GET /api/tags` on the Ollama instance.
* [ ] Parse the JSON response to extract model names.
* [ ] Frontend: Replace the "Ollama Model" text input with a `<select>` dropdown.
* [ ] Frontend: Populate the dropdown on load.
* [ ] Frontend: Handle connection errors gracefully (if Ollama isn't running, show empty or error).
## Out of Scope
* Downloading new models via the UI (pulling).
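The parsing step is small: `GET /api/tags` returns a JSON object with a `models` array, each entry carrying a `name` field (per Ollama's API; treat the exact shape as an assumption). A sketch:

```typescript
// Minimal shape of the GET /api/tags response.
interface TagsResponse {
  models: { name: string }[];
}

// Pull out just the model names for the dropdown.
function extractModelNames(resp: TagsResponse): string[] {
  return resp.models.map((m) => m.name);
}
```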

@@ -0,0 +1,16 @@
# Story: Persist Project Selection
## User Story
**As a** User
**I want** the application to remember the last project I opened
**So that** I don't have to re-select the directory every time I restart the app.
## Acceptance Criteria
* [ ] Backend: Use `tauri-plugin-store` (or simple JSON file) to persist `last_project_path`.
* [ ] Backend: On app startup, check if a saved path exists.
* [ ] Backend: If saved path exists and is valid, automatically load it into `SessionState`.
* [ ] Frontend: On load, check if backend has a project ready. If so, skip selection screen.
* [ ] Frontend: Add a "Close Project" button to clear the state and return to selection screen.
## Out of Scope
* Managing a list of "Recent Projects" (just the last one is fine for now).

@@ -0,0 +1,19 @@
# Story: Fix UI Responsiveness (Tech Debt)
## User Story
**As a** User
**I want** the UI to remain interactive and responsive while the Agent is thinking or executing tools
**So that** I don't feel like the application has crashed.
## Context
Currently, the UI locks up or becomes unresponsive during long LLM generations or tool executions. Even though the backend commands are async, the frontend experience degrades.
## Acceptance Criteria
* [ ] Investigate the root cause of the freezing (JS Main Thread blocking vs. Tauri IPC blocking).
* [ ] Implement a "Streaming" architecture for Chat if necessary (getting partial tokens instead of waiting for full response).
* *Note: This might overlap with future streaming stories, but basic responsiveness is the priority here.*
* [ ] Add visual indicators (Spinner/Progress Bar) that animate smoothly during the wait.
* [ ] Ensure the "Stop Generation" button (if added) can actually interrupt the backend task.
## Out of Scope
* Full streaming text (unless that is the only way to fix the freezing).

@@ -0,0 +1,17 @@
# Story: UI Polish - Sticky Header & Compact Layout
## User Story
**As a** User
**I want** key controls (Model Selection, Tool Toggle, Project Path) to be visible at all times
**So that** I don't have to scroll up to check my configuration or change settings.
## Acceptance Criteria
* [ ] Frontend: Create a fixed `<Header />` component at the top of the viewport.
* [ ] Frontend: Move "Active Project" display into this header (make it compact/truncated if long).
* [ ] Frontend: Move "Ollama Model" and "Enable Tools" controls into this header.
* [ ] Frontend: Ensure the Chat message list scrolls *under* the header (taking up remaining height).
* [ ] Frontend: Remove the redundant "Active Project" bar from the main workspace area.
## Out of Scope
* Full visual redesign (just layout fixing).
* Settings modal (keep controls inline for now).

@@ -0,0 +1,25 @@
# Story: Collapsible Tool Outputs
## User Story
**As a** User
**I want** tool outputs (like long file contents or search results) to be collapsed by default
**So that** the chat history remains readable and I can focus on the Agent's reasoning.
## Acceptance Criteria
* [x] Frontend: Render tool outputs inside a `<details>` / `<summary>` component (or custom equivalent).
* [x] Frontend: Default state should be **Closed/Collapsed**.
* [x] Frontend: The summary line should show the Tool Name + minimal args (e.g., "▶ read_file(src/main.rs)").
* [x] Frontend: Clicking the arrow/summary expands to show the full output.
## Out of Scope
* Complex syntax highlighting for tool outputs (plain text/pre is fine).
## Implementation Plan
1. Create a reusable component for displaying tool outputs with collapsible functionality
2. Update the chat message rendering logic to use this component for tool outputs
3. Ensure the summary line displays tool name and minimal arguments
4. Verify that the component maintains proper styling and readability
5. Test expand/collapse functionality across different tool output types
## Related Functional Specs
* Functional Spec: Tool Outputs
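The summary-line format from the acceptance criteria can be sketched as a small helper. This is an illustrative sketch, not the shipped component; the truncation length is an assumption to keep the collapsed line readable.

```typescript
// Build the collapsed summary, e.g. "read_file(src/main.rs)".
// Long argument values are truncated so the summary stays on one line.
function toolSummary(name: string, args: Record<string, unknown>, maxLen = 40): string {
  const rendered = Object.values(args).map((v) => String(v)).join(", ");
  const short = rendered.length > maxLen ? rendered.slice(0, maxLen - 1) + "…" : rendered;
  return `${name}(${short})`;
}
```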

@@ -0,0 +1,27 @@
# Story: Remove Unnecessary Scroll Bars
## User Story
**As a** User
**I want** the UI to have clean, minimal scrolling without visible scroll bars
**So that** the interface looks polished and doesn't have distracting visual clutter.
## Acceptance Criteria
* [x] Remove or hide the vertical scroll bar on the right side of the chat area
* [x] Remove or hide any horizontal scroll bars that appear
* [x] Maintain scrolling functionality (content should still be scrollable, just without visible bars)
* [x] Consider using overlay scroll bars or auto-hiding scroll bars for better aesthetics
* [x] Ensure the solution works across different browsers (Chrome, Firefox, Safari)
* [x] Verify that long messages and tool outputs still scroll properly
## Out of Scope
* Custom scroll bar designs with fancy styling
* Touch/gesture scrolling improvements for mobile (desktop focus for now)
## Implementation Notes
* Use CSS `scrollbar-width: none` for Firefox
* Use `::-webkit-scrollbar { display: none; }` for Chrome/Safari
* Ensure `overflow: auto` or `overflow-y: scroll` is still applied to maintain scroll functionality
* Test with long tool outputs and chat histories to ensure no layout breaking
## Related Functional Specs
* Functional Spec: UI/UX

@@ -0,0 +1,18 @@
# Story: System Prompt & Persona
## User Story
**As a** User
**I want** the Agent to behave like a Senior Engineer and know exactly how to use its tools
**So that** it writes high-quality code and doesn't hallucinate capabilities or refuse to edit files.
## Acceptance Criteria
* [ ] Backend: Define a robust System Prompt constant (likely in `src-tauri/src/llm/prompts.rs`).
* [ ] Content: The prompt should define:
* Role: "Senior Software Engineer / Agent".
* Tone: Professional, direct, no fluff.
* Tool usage instructions: "You have access to the local filesystem. Use `read_file` to inspect context before editing."
* Workflow: "When asked to implement a feature, read relevant files first, then write."
* [ ] Backend: Inject this system message at the *start* of every `chat` session sent to the Provider.
## Out of Scope
* User-editable system prompts (future story).
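The injection rule ("at the start of every `chat` session") can be sketched as below. The real constant lives in Rust (`prompts.rs`); this TypeScript sketch uses a stand-in prompt and also dedupes any stray system message a caller might pass, which is an assumption about desired behavior.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Stand-in for the real constant defined in src-tauri/src/llm/prompts.rs.
const SYSTEM_PROMPT = "You are a Senior Software Engineer agent with filesystem tools.";

// Ensure exactly one system message, always first.
function withSystemPrompt(history: ChatMessage[]): ChatMessage[] {
  const rest = history.filter((m) => m.role !== "system");
  return [{ role: "system", content: SYSTEM_PROMPT }, ...rest];
}
```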

@@ -0,0 +1,15 @@
# Story: Persist Model Selection
## User Story
**As a** User
**I want** the application to remember which LLM model I selected
**So that** I don't have to switch from "llama3" to "deepseek" every time I launch the app.
## Acceptance Criteria
* [ ] Backend/Frontend: Use `tauri-plugin-store` to save the `selected_model` string.
* [ ] Frontend: On mount (after fetching available models), check the store.
* [ ] Frontend: If the stored model exists in the available list, select it.
* [ ] Frontend: When the user changes the dropdown, update the store.
## Out of Scope
* Persisting per-project model settings (global setting is fine for now).
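The restore logic in the criteria reduces to one pure decision: use the stored model if it is still installed, otherwise fall back. A sketch (the fallback-to-first-available behavior is an assumption, not stated in the story):

```typescript
// Choose the model to select on startup.
function pickModel(stored: string | null, available: string[]): string | null {
  if (stored && available.includes(stored)) return stored;
  // Assumed fallback: first available model, or none if Ollama has none.
  return available[0] ?? null;
}
```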

@@ -0,0 +1,40 @@
# Story: Left-Align Chat Text and Add Syntax Highlighting
## User Story
**As a** User
**I want** chat messages and code to be left-aligned instead of centered, with proper syntax highlighting for code blocks
**So that** the text is more readable, follows standard chat UI conventions, and code is easier to understand.
## Acceptance Criteria
* [x] User messages should be right-aligned (standard chat pattern)
* [x] Assistant messages should be left-aligned
* [x] Tool outputs should be left-aligned
* [x] Code blocks and monospace text should be left-aligned
* [x] Remove any center-alignment styling from the chat container
* [x] Maintain the current max-width constraint for readability
* [x] Ensure proper spacing and padding for visual hierarchy
* [x] Add syntax highlighting for code blocks in assistant messages
* [x] Support common languages: JavaScript, TypeScript, Rust, Python, JSON, Markdown, Shell, etc.
* [x] Syntax highlighting should work with the dark theme
## Out of Scope
* Redesigning the entire chat layout
* Adding avatars or profile pictures
* Changing the overall color scheme or theme (syntax highlighting colors should complement existing dark theme)
* Custom themes for syntax highlighting
## Implementation Notes
* Check `Chat.tsx` for any `textAlign: "center"` styles
* Check `App.css` for any center-alignment rules affecting the chat
* User messages should align to the right with appropriate styling
* Assistant and tool messages should align to the left
* Code blocks should always be left-aligned for readability
* For syntax highlighting, consider using:
* `react-syntax-highlighter` (works with react-markdown)
* Or `prism-react-renderer` for lighter bundle size
* Or integrate with `rehype-highlight` plugin for react-markdown
* Use a dark theme preset like `oneDark`, `vsDark`, or `dracula`
* Syntax highlighting should be applied to markdown code blocks automatically
## Related Functional Specs
* Functional Spec: UI/UX

@@ -0,0 +1,117 @@
# Story 12: Be Able to Use Claude
## User Story
As a user, I want to be able to select Claude (via Anthropic API) as my LLM provider so I can use Claude models instead of only local Ollama models.
## Acceptance Criteria
- [x] Claude models appear in the unified model dropdown (same dropdown as Ollama models)
- [x] Dropdown is organized with section headers: "Anthropic" and "Ollama" with models listed under each
- [x] When user first selects a Claude model, a dialog prompts for Anthropic API key
- [x] API key is stored securely (using Tauri store plugin for reliable cross-platform storage)
- [x] Provider is auto-detected from model name (starts with `claude-` = Anthropic, otherwise = Ollama)
- [x] Chat requests route to Anthropic API when Claude model is selected
- [x] Streaming responses work with Claude (token-by-token display)
- [x] Tool calling works with Claude (using Anthropic's tool format)
- [x] Context window calculation accounts for Claude models (200k tokens)
- [x] User's model selection persists between sessions
- [x] Clear error messages if API key is missing or invalid
## Out of Scope
- Support for other providers (OpenAI, Google, etc.) - can be added later
- API key management UI (rotation, multiple keys, view/edit key after initial entry)
- Cost tracking or usage monitoring
- Model fine-tuning or custom models
- Switching models mid-conversation (user can start new session)
- Fetching available Claude models from API (hardcoded list is fine)
## Technical Notes
- Anthropic API endpoint: `https://api.anthropic.com/v1/messages`
- API key should be stored securely (environment variable or secure storage)
- Claude models support tool use (function calling)
- Context windows: claude-3-5-sonnet (200k), claude-3-5-haiku (200k)
- Streaming uses Server-Sent Events (SSE)
- Tool format differs from OpenAI/Ollama - needs conversion
## Design Considerations
- Single unified model dropdown with section headers ("Anthropic", "Ollama")
- Use `<optgroup>` in HTML select for visual grouping
- API key dialog appears on-demand (first use of Claude model)
- Store API key in OS keychain using `keyring` crate (cross-platform)
- Backend auto-detects provider from model name pattern
- Handle API key in backend only (don't expose to frontend logs)
- Alphabetical sorting within each provider section
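The auto-detection rule above is a one-liner. A sketch of the name-prefix check (shown in TypeScript for illustration; the story places the actual detection in the backend):

```typescript
type Provider = "anthropic" | "ollama";

// Models named "claude-*" route to the Anthropic API;
// everything else is assumed to be a local Ollama model.
function detectProvider(model: string): Provider {
  return model.startsWith("claude-") ? "anthropic" : "ollama";
}
```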
## Implementation Approach
### Backend (Rust)
1. Add `anthropic` feature/module for Claude API client
2. Create `AnthropicClient` with streaming support
3. Convert tool definitions to Anthropic format
4. Handle Anthropic streaming response format
5. Add API key storage (encrypted or environment variable)
### Frontend (TypeScript)
1. Add hardcoded list of Claude models (claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022)
2. Merge Ollama and Claude models into single dropdown with `<optgroup>` sections
3. Create API key input dialog/modal component
4. Trigger API key dialog when Claude model selected and no key stored
5. Add Tauri command to check if API key exists in keychain
6. Add Tauri command to set API key in keychain
7. Update context window calculations for Claude models (200k tokens)
### API Differences
- Anthropic uses `messages` array format (similar to OpenAI)
- Tools are called `tools` with different schema
- Streaming events have different structure
- Need to map our tool format to Anthropic's format
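The tool-format mapping is mostly a field rename: in Anthropic's Messages API a tool is `{ name, description, input_schema }`, whereas the OpenAI/Ollama style nests the JSON Schema under `function.parameters`. A sketch of the conversion (field names per the respective API docs; treat the internal shape as an assumption):

```typescript
// OpenAI/Ollama-style tool definition.
interface OllamaTool {
  type: "function";
  function: { name: string; description: string; parameters: object };
}

// Anthropic-style tool definition.
interface AnthropicTool {
  name: string;
  description: string;
  input_schema: object;
}

// The JSON Schema moves from function.parameters to top-level input_schema.
function toAnthropicTool(tool: OllamaTool): AnthropicTool {
  return {
    name: tool.function.name,
    description: tool.function.description,
    input_schema: tool.function.parameters,
  };
}
```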
## Security Considerations
- API key stored in OS keychain (not in files or environment variables)
- Use `keyring` crate for cross-platform secure storage
- Never log API key in console or files
- Backend validates API key format before making requests
- Handle API errors gracefully (rate limits, invalid key, network errors)
- API key only accessible to the app process
## UI Flow
1. User opens model dropdown → sees "Anthropic" section with Claude models, "Ollama" section with local models
2. User selects `claude-3-5-sonnet-20241022`
3. Backend checks Tauri store for saved API key
4. If not found → Frontend shows dialog: "Enter your Anthropic API key"
5. User enters key → Backend stores in Tauri store (persistent JSON file)
6. Chat proceeds with Anthropic API
7. Future sessions: API key auto-loaded from store (no prompt)
## Implementation Notes (Completed)
### Storage Solution
Initially attempted to use the `keyring` crate for OS keychain integration, but encountered issues in macOS development mode:
- Unsigned Tauri apps in dev mode cannot reliably access the system keychain
- The `keyring` crate reported successful saves but keys were not persisting
- No macOS keychain permission dialogs appeared
**Solution:** Switched to Tauri's `store` plugin (`tauri-plugin-store`)
- Provides reliable cross-platform persistent storage
- Stores data in a JSON file managed by Tauri
- Works consistently in both development and production builds
- Simpler implementation without platform-specific entitlements
### Key Files Modified
- `src-tauri/src/commands/chat.rs`: API key storage/retrieval using Tauri store
- `src/components/Chat.tsx`: API key dialog and flow with pending message preservation
- `src-tauri/Cargo.toml`: Removed `keyring` dependency, kept `tauri-plugin-store`
- `src-tauri/src/llm/anthropic.rs`: Anthropic API client with streaming support
### Frontend Implementation
- Added `pendingMessageRef` to preserve user's message when API key dialog is shown
- Modified `sendMessage()` to accept optional message parameter for retry scenarios
- API key dialog appears on first Claude model usage
- After saving key, automatically retries sending the pending message
### Backend Implementation
- `get_anthropic_api_key_exists()`: Checks if API key exists in store
- `set_anthropic_api_key()`: Saves API key to store with verification
- `get_anthropic_api_key()`: Retrieves API key for Anthropic API calls
- Provider auto-detection based on `claude-` model name prefix
- Tool format conversion from internal format to Anthropic's schema
- SSE streaming implementation for real-time token display

@@ -0,0 +1,82 @@
# Story 13: Stop Button
## User Story
**As a** User
**I want** a Stop button to cancel the model's response while it's generating
**So that** I can immediately stop long-running or unwanted responses without waiting for completion
## The Problem
**Current Behavior:**
- User sends message → Model starts generating
- User realizes they don't want the response (wrong question, too long, etc.)
- **No way to stop it** - must wait for completion
- Tool calls will execute even if user wants to cancel
**Why This Matters:**
- Long responses waste time
- Tool calls have side effects (file writes, searches, shell commands)
- User has no control once generation starts
- Standard UX pattern in ChatGPT, Claude, etc.
## Acceptance Criteria
- [ ] Stop button (⬛) appears in place of Send button (↑) while model is generating
- [ ] Clicking Stop immediately cancels the backend request
- [ ] Tool calls that haven't started yet are NOT executed after cancellation
- [ ] Streaming stops immediately
- [ ] Partial response generated before stopping remains visible in chat
- [ ] Stop button becomes Send button again after cancellation
- [ ] User can immediately send a new message after stopping
- [ ] Input field remains enabled during generation
## Out of Scope
- Escape key shortcut (can add later)
- Confirmation dialog (immediate action is better UX)
- Undo/redo functionality
- New Session flow (that's Story 14)
## Implementation Approach
### Backend
- Add `cancel_chat` command callable from frontend
- Use `tokio::select!` to race chat execution vs cancellation signal
- Check cancellation before executing each tool
- Return early when cancelled (not an error - expected behavior)
### Frontend
- Replace Send button with Stop button when `loading` is true
- On Stop click: call `invoke("cancel_chat")` and set `loading = false`
- Keep input enabled during generation
- Visual: Make Stop button clearly distinct (⬛ or "Stop" text)
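The "check cancellation before executing each tool" step can be sketched with a minimal token, standing in for the tokio watch channel on the Rust side. This is an illustrative sketch of the control flow, not the shipped implementation:

```typescript
// Minimal cancellation token (the Rust side would use a tokio watch channel).
class CancellationToken {
  private cancelled = false;
  cancel(): void { this.cancelled = true; }
  get isCancelled(): boolean { return this.cancelled; }
}

// Agent-loop guard: check before each tool, skip the rest once cancelled.
function runTools(tools: (() => void)[], token: CancellationToken): number {
  let executed = 0;
  for (const tool of tools) {
    if (token.isCancelled) break; // per the AC: no tool starts after Stop
    tool();
    executed++;
  }
  return executed;
}
```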
## Testing Strategy
1. **Test Stop During Streaming:**
- Send message requesting long response
- Click Stop while streaming
- Verify streaming stops immediately
- Verify partial response remains visible
- Verify can send new message
2. **Test Stop Before Tool Execution:**
- Send message that will use tools
- Click Stop while "thinking" (before tool executes)
- Verify tool does NOT execute (check logs/filesystem)
3. **Test Stop During Tool Execution:**
- Send message with multiple tool calls
- Click Stop after first tool executes
- Verify remaining tools do NOT execute
## Success Criteria
**Before:**
- User sends message → No way to stop → Must wait for completion → Frustrating UX
**After:**
- User sends message → Stop button appears → User clicks Stop → Generation cancels immediately → Partial response stays → Can send new message
## Related Stories
- Story 14: New Session Cancellation (same backend mechanism, different trigger)
- Story 18: Streaming Responses (Stop must work with streaming)

@@ -0,0 +1,27 @@
# Story: Auto-focus Chat Input on Startup
## User Story
**As a** User
**I want** the cursor to automatically appear in the chat input box when the app starts
**So that** I can immediately start typing without having to click into the input field first.
## Acceptance Criteria
* [x] When the app loads and a project is selected, the chat input box should automatically receive focus
* [x] The cursor should be visible and blinking in the input field
* [x] User can immediately start typing without any additional clicks
* [x] Focus should be set after the component mounts
* [x] Should not interfere with other UI interactions
## Out of Scope
* Auto-focus when switching between projects (only on initial load)
* Remembering cursor position across sessions
* Focus management for other input fields
## Implementation Notes
* Use React `useEffect` hook to set focus on component mount
* Use a ref to reference the input element
* Call `inputRef.current?.focus()` after component renders
* Ensure it works consistently across different browsers
## Related Functional Specs
* Functional Spec: UI/UX

@@ -0,0 +1,99 @@
# Story 14: New Session Cancellation
## User Story
**As a** User
**I want** the backend to stop processing when I start a new session
**So that** tools don't silently execute in the background and streaming doesn't leak into my new session
## The Problem
**Current Behavior (THE BUG):**
1. User sends message → Backend starts streaming → About to execute a tool (e.g., `write_file`)
2. User clicks "New Session" and confirms
3. Frontend clears messages and UI state
4. **Backend keeps running** → Tool executes → File gets written → Streaming continues
5. **Streaming tokens appear in the new session**
6. User has no idea these side effects occurred in the background
**Why This Is Critical:**
- Tool calls have real side effects (file writes, shell commands, searches)
- These happen silently after user thinks they've started fresh
- Streaming from old session leaks into new session
- Can cause confusion, data corruption, or unexpected system state
- User expects "New Session" to mean a clean slate
## Acceptance Criteria
- [ ] Clicking "New Session" and confirming cancels any in-flight backend request
- [ ] Tool calls that haven't started yet are NOT executed
- [ ] Streaming from old request does NOT appear in new session
- [ ] Backend stops processing immediately when cancellation is triggered
- [ ] New session starts with completely clean state
- [ ] No silent side effects in background after new session starts
## Out of Scope
- Stop button during generation (that's Story 13)
- Improving the confirmation dialog (already done in Story 20)
- Rolling back already-executed tools (partial work stays)
## Implementation Approach
### Backend
- Uses same `cancel_chat` command as Story 13
- Same cancellation mechanism (tokio::select!, watch channel)
### Frontend
- Call `invoke("cancel_chat")` BEFORE clearing UI state in `clearSession()`
- Wait for cancellation to complete before clearing messages
- Ensure old streaming events don't arrive after clear
## Testing Strategy
1. **Test Tool Call Prevention:**
- Send message that will use tools (e.g., "search all TypeScript files")
- Click "New Session" while it's thinking
- Confirm in dialog
- Verify tool does NOT execute (check logs/filesystem)
- Verify new session is clean
2. **Test Streaming Leak Prevention:**
- Send message requesting long response
- While streaming, click "New Session" and confirm
- Verify old streaming stops immediately
- Verify NO tokens from old request appear in new session
- Type new message and verify only new response appears
3. **Test File Write Prevention:**
- Ask to write a file: "Create test.txt with current timestamp"
- Click "New Session" before tool executes
- Check filesystem: test.txt should NOT exist
- Verify no background file creation happens
## Success Criteria
**Before (BROKEN):**
```
User: "Search files and write results.txt"
Backend: Starts streaming...
User: *clicks New Session, confirms*
Frontend: Clears UI ✓
Backend: Still running... executes search... writes file... ✗
Result: File written silently in background ✗
Old streaming tokens appear in new session ✗
```
**After (FIXED):**
```
User: "Search files and write results.txt"
Backend: Starts streaming...
User: *clicks New Session, confirms*
Frontend: Calls cancel_chat, waits, then clears UI ✓
Backend: Receives cancellation, stops immediately ✓
Backend: Tools NOT executed ✓
Result: Clean new session, no background activity ✓
```
## Related Stories
- Story 13: Stop Button (shares same backend cancellation mechanism)
- Story 20: New Session confirmation dialog (UX for triggering this)
- Story 18: Streaming Responses (must not leak between sessions)

@@ -0,0 +1,82 @@
# Story 17: Display Context Window Usage
## User Story
As a user, I want to see how much of the model's context window I'm currently using, so that I know when I'm approaching the limit and should start a new session to avoid losing conversation quality.
## Acceptance Criteria
- [x] A visual indicator shows the current context usage (e.g., "2.5K / 8K tokens" or percentage)
- [x] The indicator is always visible in the UI (header area recommended)
- [x] The display updates in real-time as messages are added
- [x] Different models show their appropriate context window size (e.g., 8K for llama3.1, 128K for larger models)
- [x] The indicator changes color or style when approaching the limit (e.g., yellow at 75%, red at 90%)
- [x] Hovering over the indicator shows more details (tokens per message breakdown - optional)
- [x] The calculation includes system prompts, user messages, assistant responses, and tool outputs
- [x] Token counting is reasonably accurate (it doesn't need to be perfect; an estimate is fine)
## Out of Scope
- Exact token counting (approximation is acceptable)
- Automatic session clearing when limit reached
- Per-message token counts in the UI
- Token usage history or analytics
- Different tokenizers for different models (use one estimation method)
- Backend token tracking from Ollama (estimate on frontend)
## Technical Notes
### Token Estimation
- Simple approximation: 1 token ≈ 4 characters (English text)
- Or use a basic tokenizer library like `gpt-tokenizer` or `tiktoken` (JS port)
- Count all message content: system prompts + user messages + assistant responses + tool outputs
- Include tool call JSON in the count
### Context Window Sizes
Common model context windows:
- llama3.1, llama3.2: 8K tokens (8,192)
- qwen2.5-coder: 32K tokens
- deepseek-coder: 16K tokens
- Default/unknown: 8K tokens
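The table above reduces to a lookup with a fallback. A sketch (matching on the base name so Ollama tags like `llama3.1:8b` still resolve; the tag-stripping behavior is an assumption):

```typescript
// Context window sizes from the table above; unknown models fall back to 8K.
const CONTEXT_WINDOWS: Record<string, number> = {
  "llama3.1": 8192,
  "llama3.2": 8192,
  "qwen2.5-coder": 32768,
  "deepseek-coder": 16384,
};

function contextWindowFor(model: string): number {
  const base = model.split(":")[0]; // strip Ollama tag, e.g. ":8b"
  return CONTEXT_WINDOWS[base] ?? 8192;
}
```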
### Implementation Approach
```tsx
// Simple character-based estimation
const estimateTokens = (text: string): number => {
  return Math.ceil(text.length / 4);
};

const calculateTotalTokens = (messages: Message[]): number => {
  let total = 0;
  // Add system prompt tokens (from backend)
  total += estimateTokens(SYSTEM_PROMPT);
  // Add all message tokens
  for (const msg of messages) {
    total += estimateTokens(msg.content);
    if (msg.tool_calls) {
      total += estimateTokens(JSON.stringify(msg.tool_calls));
    }
  }
  return total;
};
```
### UI Placement
- Header area, right side near model selector
- Format: "2.5K / 8K tokens (31%)"
- Color coding:
- Green/default: 0-74%
- Yellow/warning: 75-89%
- Red/danger: 90-100%
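The color-coding thresholds map directly to a small classifier:

```typescript
type UsageLevel = "ok" | "warning" | "danger";

// Thresholds from the spec above: yellow at 75%, red at 90%.
function usageLevel(used: number, limit: number): UsageLevel {
  const pct = (used / limit) * 100;
  if (pct >= 90) return "danger";
  if (pct >= 75) return "warning";
  return "ok";
}
```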
## Design Considerations
- Keep it subtle and non-intrusive
- Should be informative but not alarming
- Consider a small progress bar or circular indicator
- Example: "📊 2,450 / 8,192 (30%)"
- Or icon-based: "🟢 30% context"
## Future Enhancements (Not in this story)
- Backend token counting from Ollama (if available)
- Per-message token display on hover
- "Summarize and continue" feature to compress history
- Export/archive conversation before clearing

@@ -0,0 +1,28 @@
# Story 18: Token-by-Token Streaming Responses
## User Story
As a user, I want to see the AI's response appear token-by-token in real-time (like ChatGPT), so that I get immediate feedback and know the system is working, rather than waiting for the entire response to appear at once.
## Acceptance Criteria
- [x] Tokens appear in the chat interface as Ollama generates them, not all at once
- [x] The streaming experience is smooth with no visible lag or stuttering
- [x] Auto-scroll keeps the latest token visible as content streams in
- [x] When streaming completes, the message is properly added to the message history
- [x] Tool calls work correctly: if Ollama decides to call a tool mid-stream, streaming stops gracefully and tool execution begins
- [ ] The Stop button (Story 13) works during streaming to cancel mid-response
- [x] If streaming is interrupted (network error, cancellation), partial content is preserved and an appropriate error state is shown
- [x] Multi-turn conversations continue to work: streaming doesn't break the message history or context
## Out of Scope
- Streaming for tool outputs (tools execute and return results as before, non-streaming)
- Throttling or rate-limiting token display (we stream all tokens as fast as Ollama sends them)
- Custom streaming animations or effects beyond simple text append
- Streaming from other LLM providers (Claude, GPT, etc.) - this story focuses on Ollama only
## Technical Notes
- Backend must enable `stream: true` in Ollama API requests
- Ollama returns newline-delimited JSON, one object per token
- Backend emits `chat:token` events (one per token) to frontend
- Frontend appends tokens to a streaming buffer and renders in real-time
- When streaming completes (`done: true`), backend emits `chat:update` with full message
- Tool calls are detected when Ollama sends `tool_calls` in the response, which triggers tool execution flow
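The newline-delimited JSON handling can be sketched as a pure buffer-accumulation step. Field names follow Ollama's streaming format (`message.content` per chunk, `done: true` on the final one), but treat the exact shape as an assumption; the real work happens in the Rust backend before `chat:token` events are emitted.

```typescript
// One JSON object per line in Ollama's streaming response.
interface StreamChunk {
  message?: { content?: string; tool_calls?: unknown[] };
  done: boolean;
}

// Accumulate token content and report when the stream has finished.
function processStream(lines: string[]): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const line of lines) {
    if (!line.trim()) continue; // skip blank lines between chunks
    const chunk: StreamChunk = JSON.parse(line);
    text += chunk.message?.content ?? ""; // append this token to the buffer
    if (chunk.done) done = true;
  }
  return { text, done };
}
```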

@@ -0,0 +1,39 @@
# Story 20: Start New Session / Clear Chat History
## User Story
As a user, I want to be able to start a fresh conversation without restarting the entire application, so that I can begin a new task with completely clean context (both frontend and backend) while keeping the same project open.
## Acceptance Criteria
- [x] There is a visible "New Session" or "Clear Chat" button in the UI
- [x] Clicking the button clears all messages from the chat history (frontend)
- [x] The backend conversation context is also cleared (no message history retained)
- [x] The input field remains enabled and ready for a new message
- [x] The button asks for confirmation before clearing (to prevent accidental data loss)
- [x] After clearing, the chat shows an empty state or welcome message
- [x] The project path and model settings are preserved (only messages are cleared)
- [x] Any ongoing streaming or tool execution is cancelled before clearing
- [x] The action is immediate and provides visual feedback
## Out of Scope
- Saving/exporting previous sessions before clearing
- Multiple concurrent chat sessions or tabs
- Undo functionality after clearing
- Automatic session management or limits
- Session history or recovery
## Technical Notes
- Frontend state (`messages` and `streamingContent`) needs to be cleared
- Backend conversation history must be cleared (no retained context from previous messages)
- Backend may need a `clear_session` or `reset_context` command
- Cancel any in-flight operations before clearing
- Should integrate with the cancellation mechanism from Story 13 (if implemented)
- Button should be placed in the header area near the model selector
- Consider using a modal dialog for confirmation
- State: `setMessages([])` to clear the frontend array
- Backend: Clear the message history that gets sent to the LLM
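The notes above can be combined into one handler. A minimal sketch, assuming a Tauri-style `invoke` bridge and hypothetical `cancel_generation` / `clear_session` command names (the story only suggests `clear_session` or `reset_context`; none of these names are final):

```typescript
// Hypothetical sketch: `invoke`, `cancel_generation`, and `clear_session`
// are assumed names, not confirmed APIs in this codebase.
async function startNewSession(
  invoke: (cmd: string) => Promise<void>,
  confirmFn: (msg: string) => boolean,
  setMessages: (msgs: unknown[]) => void,
  setStreamingContent: (s: string) => void,
): Promise<boolean> {
  if (!confirmFn("Clear all messages and reset the conversation context?")) {
    return false; // user declined; nothing is touched
  }
  await invoke("cancel_generation"); // stop any in-flight streaming first
  await invoke("clear_session");     // drop backend message history
  setMessages([]);                   // clear the frontend chat array
  setStreamingContent("");           // discard any partial streamed text
  return true;
}
```

Note that the project path and model settings are never touched here, matching the acceptance criteria.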
## Design Considerations
- Button placement: Header area (top right or near model controls)
- Button style: Secondary/subtle to avoid accidental clicks
- Confirmation dialog: "Are you sure? This will clear all messages and reset the conversation context."
- Icon suggestion: 🔄 or "New" text label

View File

@@ -0,0 +1,48 @@
# Story 22: Smart Auto-Scroll (Respects User Scrolling)
## User Story
As a user, I want to be able to scroll up to review previous messages while the AI is streaming or adding new content, without being constantly dragged back to the bottom.
## Acceptance Criteria
- [x] When I scroll up in the chat, auto-scroll is temporarily disabled
- [x] Auto-scroll resumes when I scroll back to (or near) the bottom
- [ ] There's a visual indicator when auto-scroll is paused (optional)
- [ ] Clicking a "Jump to Bottom" button (if added) re-enables auto-scroll
- [x] Auto-scroll works normally when I'm already at the bottom
- [x] The detection works smoothly without flickering
- [x] Works during both streaming responses and tool execution
## Out of Scope
- Manual scroll position restoration after page refresh
- Scroll position memory across sessions
- Keyboard shortcuts for scrolling
- Custom scroll speed or animation settings
## Technical Notes
- Detect whether the user is at the bottom: `scrollHeight - scrollTop - clientHeight` is within a small threshold of zero
- Only auto-scroll if user is at/near bottom (e.g., within 100px)
- Track scroll position in state or ref
- Add scroll event listener to detect when user manually scrolls
- Consider debouncing the scroll detection for performance
## Design Considerations
- Threshold for "near bottom": 100-150px is typical
- Optional: Show a "↓ New messages" badge when auto-scroll is paused
- Should feel natural and not interfere with reading
- Balance between auto-scroll convenience and user control
## Implementation Approach
```tsx
const isScrolledToBottom = () => {
  const element = scrollContainerRef.current;
  if (!element) return true;
  const threshold = 150; // pixels from bottom
  return element.scrollHeight - element.scrollTop - element.clientHeight < threshold;
};

useEffect(() => {
  if (isScrolledToBottom()) {
    scrollToBottom();
  }
}, [messages, streamingContent]);
```

View File

@@ -0,0 +1,36 @@
# Story 23: Alphabetize LLM Dropdown List
## User Story
As a user, I want the LLM model dropdown to be alphabetically sorted so I can quickly find the model I'm looking for.
## Acceptance Criteria
- [x] The model dropdown list is sorted alphabetically (case-insensitive)
- [x] The currently selected model remains selected after sorting
- [x] The sorting works for all models returned from Ollama
- [x] The sorted list updates correctly when models are added/removed
## Out of Scope
- Grouping models by type or provider
- Custom sort orders (e.g., by popularity, recency)
- Search/filter functionality in the dropdown
- Favoriting or pinning specific models to the top
## Technical Notes
- Models are fetched from `get_ollama_models` Tauri command
- Currently displayed in the order returned by the backend
- Sort should be case-insensitive (e.g., "Llama" and "llama" treated equally)
- JavaScript's `sort()` with `localeCompare()` is ideal for this
## Implementation Approach
```tsx
// After fetching models from the backend. Copy the array before sorting,
// since Array.prototype.sort() mutates in place (unsafe on React state).
const sortedModels = [...models].sort((a, b) =>
  a.toLowerCase().localeCompare(b.toLowerCase())
);
setAvailableModels(sortedModels);
```
## Design Considerations
- Keep it simple - alphabetical order is intuitive
- Case-insensitive to handle inconsistent model naming
- No need to change backend - sorting on frontend is sufficient

View File

@@ -0,0 +1,23 @@
# Story 01: Replace Tauri with Browser UI Served by Rust Binary
## User Story
As a user, I want to run a single Rust binary that serves the web UI and exposes a WebSocket API, so I can use the app in my browser without installing a desktop shell.
## Acceptance Criteria
- The app runs as a single Rust binary that:
  - Serves the built frontend assets from a `frontend` directory.
  - Exposes a WebSocket endpoint for chat streaming and tool execution.
- The browser UI uses the WebSocket API for:
  - Sending chat messages.
  - Receiving streaming token updates and final chat history updates.
  - Requesting file operations, search, and shell execution.
- The project selection UI uses a browser file picker (not native OS dialogs).
- Model preference and last project selection are persisted server-side (no Tauri store).
- The Tauri backend and configuration are removed from the build pipeline.
- The frontend remains a Vite/React build and is served as static assets by the Rust binary.
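On the browser side, the WebSocket traffic could be dispatched with a small pure handler. The message shapes below (`token` and `history` kinds) are illustrative assumptions, not a protocol this story defines:

```typescript
// Assumed message shapes, for illustration only.
type ServerMsg =
  | { kind: "token"; text: string }
  | { kind: "history"; messages: string[] };

// Pure handler so dispatch stays testable without opening a real socket;
// wire it up with `ws.onmessage = (ev) => handleServerMessage(ev.data, ...)`.
function handleServerMessage(
  raw: string,
  onToken: (t: string) => void,
  onHistory: (m: string[]) => void,
): void {
  const msg = JSON.parse(raw) as ServerMsg;
  if (msg.kind === "token") {
    onToken(msg.text);       // streaming token update
  } else {
    onHistory(msg.messages); // final chat history update
  }
}
```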
## Out of Scope
- Reworking the LLM provider implementations beyond wiring changes.
- Changing the UI layout/visual design.
- Adding authentication or multi-user support.
- Switching away from Vite for frontend builds.

View File

@@ -0,0 +1,24 @@
# Story 25: Auto-Scaffold Story Kit Metadata on New Projects
## User Story
As a user, I want the app to automatically scaffold the `.story_kit` directory when I open a path that doesn't exist, so new projects are ready for the Story Kit workflow immediately.
## Acceptance Criteria
- When I enter a non-existent project path and press Enter/Open, the app creates the directory.
- The app also creates the `.story_kit` directory under the new project root.
- The `.story_kit` structure includes:
  - `README.md` (the Story Kit workflow instructions)
  - `specs/`
    - `README.md`
    - `00_CONTEXT.md`
    - `tech/STACK.md`
    - `functional/` (created, even if empty)
  - `stories/`
  - `archive/`
- The project opens successfully after scaffolding completes.
- If any scaffolding step fails, the UI shows a clear error message and does not open the project.
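One way to keep the scaffolding step auditable is to separate the path list from the filesystem work. A hedged sketch (the nesting mirrors the structure listed above; a trailing `/` marks a directory, and nothing here is the final layout):

```typescript
// Sketch: enumerate every path the scaffold step should create under the
// new project root. Directories end in "/"; files get template content later.
function storyKitScaffold(root: string): string[] {
  const kit = `${root}/.story_kit`;
  return [
    `${kit}/`,
    `${kit}/README.md`,
    `${kit}/specs/`,
    `${kit}/specs/README.md`,
    `${kit}/specs/00_CONTEXT.md`,
    `${kit}/specs/tech/`,
    `${kit}/specs/tech/STACK.md`,
    `${kit}/specs/functional/`,
    `${kit}/stories/`,
    `${kit}/archive/`,
  ];
}
```

If any single creation fails, the caller can abort and surface one clear error instead of opening a half-scaffolded project, matching the acceptance criteria.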
## Out of Scope
- Creating any `src/` files or application code.
- Populating project-specific content beyond the standard Story Kit templates.
- Prompting the user for metadata (e.g., project name, description, stack choices).