# Functional Spec: UI/UX Responsiveness

## Problem
Currently, the `chat` command in Rust is an async function that performs a long-running, blocking loop (waiting for the LLM, executing tools). Although Tauri executes this on a separate thread from the UI, the frontend awaits the *entire* result before re-rendering. The app therefore feels "frozen": the user gets no feedback during the 10-60 seconds of generation.

## Solution: Event-Driven Feedback
Instead of waiting for the final array of messages, the Backend should emit **Events** to the Frontend in real-time.

### 1. Events
*   `chat:token`: Emitted when a text token is generated (Streaming text).
*   `chat:tool-start`: Emitted when a tool call begins (e.g., `{ tool: "git status" }`).
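*   `chat:tool-end`: Emitted when a tool call finishes (e.g., `{ output: "..." }`).
*   `chat:update`: Emitted when the LLM response completes or after a tool executes (payload: `Message[]`).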

### 2. Implementation Strategy

#### Token-by-Token Streaming (Story 18)
The system now implements full token streaming for real-time response display:

*   **Backend (Rust):**
    *   Set `stream: true` in Ollama API requests
    *   Parse newline-delimited JSON from Ollama's streaming response
    *   Emit `chat:token` events for each token received
    *   Use `reqwest` streaming body with async iteration
    *   After streaming completes, emit `chat:update` with the full message
    
*   **Frontend (TypeScript):**
    *   Listen for `chat:token` events
    *   Append tokens to the current assistant message in real-time
    *   Maintain smooth auto-scroll as tokens arrive
    *   After streaming completes, process `chat:update` for final state

*   **Event-Driven Updates:**
    *   `chat:token`: Emitted for each token during streaming (payload: `{ content: string }`)
    *   `chat:update`: Emitted after LLM response complete or after Tool Execution (payload: `Message[]`)
    *   Frontend maintains streaming state separate from message history
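
A minimal sketch of how the frontend might fold these two events into state. It assumes `chat:update` carries the full message list (the spec gives the payload type but does not say whether it is the whole history or a delta). The `Message` shape, `ChatState`, and `applyChatEvent` are illustrative names, not part of the spec.

```typescript
// Hypothetical message shape; the real type lives in the app.
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
}

// The two event payloads named in the spec.
type ChatEvent =
  | { type: "chat:token"; payload: { content: string } }
  | { type: "chat:update"; payload: Message[] };

interface ChatState {
  messages: Message[];      // committed history
  streamingContent: string; // in-progress assistant text
}

function applyChatEvent(state: ChatState, event: ChatEvent): ChatState {
  switch (event.type) {
    case "chat:token":
      // Append to the streaming buffer; committed history is untouched.
      return {
        ...state,
        streamingContent: state.streamingContent + event.payload.content,
      };
    case "chat:update":
      // Assumed: the payload replaces history, and the buffer is cleared.
      return { messages: event.payload, streamingContent: "" };
  }
}
```

Keeping the function pure means the same logic can back a `useReducer` hook or a plain event listener.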

### 3. Visuals
*   **Loading State:** While a response is generating, the "Send" button should show a spinner or be replaced by a "Stop" button.
*   **Auto-Scroll:** The chat view should stick to the bottom as new events arrive.

## Tool Output Display

### Problem
Tool outputs (like file contents, search results, or command output) can be very long, making the chat history difficult to read. Users need to see the Agent's reasoning and responses without being overwhelmed by verbose tool output.

### Solution: Collapsible Tool Outputs
Tool outputs should be rendered in a collapsible component that is **closed by default**.

### Requirements

1. **Default State:** Tool outputs are collapsed/closed when first rendered
2. **Summary Line:** Shows essential information without expanding:
   - Tool name (e.g., `read_file`, `exec_shell`)
   - Key arguments (e.g., file path, command name)
   - Format: "▶ tool_name(key_arg)"
   - Example: "▶ read_file(src/main.rs)"
   - Example: "▶ exec_shell(cargo check)"
3. **Expandable:** User can click the summary to toggle expansion
4. **Output Display:** When expanded, shows the complete tool output in a readable format:
   - Use `<pre>` or monospace font for code/terminal output
   - Preserve whitespace and line breaks
   - Limit height with scrolling for very long outputs (e.g., max-height: 300px)
5. **Visual Indicator:** Clear arrow or icon showing collapsed/expanded state
6. **Styling:** Consistent with the dark theme, distinguishable from assistant messages

### Implementation Notes
*   Use native `<details>` and `<summary>` HTML elements for accessibility
*   Or implement custom collapsible component with proper ARIA attributes
*   Tool outputs should be visually distinct (border, background color, or badge)
*   Multiple tool calls in sequence should each be independently collapsible
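
The summary-line format from the requirements can be sketched as a small formatter. The `ToolCall` shape and the list of "key" argument names are assumptions made for illustration; the spec only fixes the output format.

```typescript
// Hypothetical shape of a tool call as the UI sees it.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Argument names checked, in order, to find the "key" argument (assumed).
const KEY_ARG_NAMES = ["path", "command", "query"];

function formatToolSummary(call: ToolCall, expanded: boolean = false): string {
  // The arrow doubles as the collapsed/expanded indicator.
  const arrow = expanded ? "▼" : "▶";
  let keyArg = "";
  for (const name of KEY_ARG_NAMES) {
    const value = call.args[name];
    if (typeof value === "string") {
      keyArg = value;
      break;
    }
  }
  return `${arrow} ${call.name}(${keyArg})`;
}
```

With native `<details>`/`<summary>`, the browser draws its own disclosure marker, so the arrow prefix may be redundant there.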

## Scroll Bar Styling

### Problem
Visible scroll bars create visual clutter and make the interface feel less polished. Standard browser scroll bars can be distracting and break the clean aesthetic of the dark theme.

### Solution: Hidden Scroll Bars with Maintained Functionality
Scroll bars should be hidden while maintaining full scroll functionality.

### Requirements

1. **Visual:** Scroll bars should not be visible to the user
2. **Functionality:** Scrolling must still work perfectly:
   - Mouse wheel scrolling
   - Trackpad scrolling
   - Keyboard navigation (arrow keys, page up/down)
   - Auto-scroll to bottom for new messages
3. **Cross-browser:** Solution must work on Chrome, Firefox, and Safari
4. **Areas affected:**
   - Main chat message area (vertical scroll)
   - Tool output content (both vertical and horizontal)
   - Any other scrollable containers

### Implementation Notes
*   Use CSS `scrollbar-width: none` for Firefox
*   Use `::-webkit-scrollbar { display: none; }` for Chrome/Safari/Edge
*   Maintain `overflow: auto` or `overflow-y: scroll` to preserve scroll functionality
*   Ensure `overflow-x: hidden` where horizontal scroll is not needed
*   Test with very long messages and large tool outputs to ensure no layout breaking
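
Because pseudo-elements such as `::-webkit-scrollbar` cannot be set through inline styles, one option is to generate the rules as a string and inject them once. The helper below is a sketch of that approach; `hiddenScrollbarCss` and the example selector are hypothetical names.

```typescript
// Builds the CSS rules that hide scroll bars for one selector
// while leaving scrolling itself enabled.
function hiddenScrollbarCss(selector: string): string {
  return [
    `${selector} {`,
    "  overflow: auto;        /* keep wheel, trackpad, and keyboard scrolling */",
    "  scrollbar-width: none; /* Firefox */",
    "}",
    `${selector}::-webkit-scrollbar {`,
    "  display: none;         /* Chrome, Safari, Edge */",
    "}",
  ].join("\n");
}
```

For example, the result of `hiddenScrollbarCss(".chat-messages")` could be placed in a `<style>` element or copied into the app's stylesheet.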

## Text Alignment and Readability

### Problem
Center-aligned text in a chat interface is unconventional and reduces readability, especially for code blocks and long-form content. Standard chat UIs align messages differently based on the sender.

### Solution: Context-Appropriate Text Alignment
Messages should follow standard chat UI conventions with proper alignment based on message type.

### Requirements

1. **User Messages:** Right-aligned (standard pattern showing messages sent by the user)
2. **Assistant Messages:** Left-aligned (standard pattern showing messages received)
3. **Tool Outputs:** Left-aligned (part of the system/assistant response flow)
4. **Code Blocks:** Always left-aligned regardless of message type (for readability)
5. **Container:** Remove any center-alignment from the chat container
6. **Max-Width:** Maintain current max-width constraint (e.g., 768px) for optimal readability
7. **Spacing:** Maintain proper padding and visual hierarchy between messages

### Implementation Notes
*   Check for `textAlign: "center"` in inline styles and remove
*   Check for `text-align: center` in CSS and remove from chat-related classes
*   Ensure flexbox alignment is set appropriately:
    *   User messages: `alignItems: "flex-end"`
    *   Assistant/Tool messages: `alignItems: "flex-start"`
*   Code blocks should have `text-align: left` explicitly set
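
The alignment rules above can be expressed as a function from message role to an inline style object. This is a sketch that assumes each message sits in its own flex column wrapper; the `Role` type and function name are illustrative.

```typescript
type Role = "user" | "assistant" | "tool";

interface RowStyle {
  display: "flex";
  flexDirection: "column";
  alignItems: "flex-start" | "flex-end";
}

// Style for the wrapper around a single message bubble.
function messageRowStyle(role: Role): RowStyle {
  return {
    display: "flex",
    flexDirection: "column",
    // Only user messages go right; assistant and tool output stay left.
    alignItems: role === "user" ? "flex-end" : "flex-start",
  };
}

// Code blocks stay left-aligned even inside right-aligned user messages.
const codeBlockStyle = { textAlign: "left" } as const;
```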

## Syntax Highlighting

### Problem
Code blocks in assistant responses currently lack syntax highlighting, making them harder to read and understand. Developers expect colored syntax highlighting similar to their code editors.

### Solution: Syntax Highlighting for Code Blocks
Integrate syntax highlighting into markdown code blocks rendered by the assistant.

### Requirements

1. **Languages Supported:** At minimum:
   - JavaScript/TypeScript
   - Rust
   - Python
   - JSON
   - Markdown
   - Shell/Bash
   - HTML/CSS
   - SQL
2. **Theme:** Use a dark theme that complements the existing dark UI (e.g., `oneDark`, `vsDark`, `dracula`)
3. **Integration:** Work seamlessly with `react-markdown` component
4. **Performance:** Should not significantly impact rendering performance
5. **Fallback:** Plain monospace text for unrecognized languages
6. **Inline Code:** Inline code (single backticks) should maintain simple styling without full syntax highlighting

### Implementation Notes
*   Use `react-syntax-highlighter` library with `react-markdown`
*   Or use `rehype-highlight` plugin for `react-markdown`
*   Configure with a dark theme preset (e.g., `oneDark` from `react-syntax-highlighter/dist/esm/styles/prism`)
*   Apply to code blocks via `react-markdown` components prop:
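    ```tsx
    import Markdown from "react-markdown";
    import { Prism as SyntaxHighlighter } from "react-syntax-highlighter";
    import { oneDark } from "react-syntax-highlighter/dist/esm/styles/prism";

    <Markdown
      components={{
        // Fenced blocks carry a `language-*` class; inline code does not.
        // `inline` is passed by react-markdown v8 and earlier; later versions
        // drop it, leaving the class match to separate block from inline code.
        code: ({node, inline, className, children, ...props}) => {
          const match = /language-(\w+)/.exec(className || '');
          return !inline && match ? (
            <SyntaxHighlighter style={oneDark} language={match[1]} {...props}>
              {String(children).replace(/\n$/, '')}
            </SyntaxHighlighter>
          ) : (
            <code className={className} {...props}>{children}</code>
          );
        }
      }}
    />
    ```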
*   Ensure syntax highlighted code blocks are left-aligned
*   Test with various code samples to ensure proper rendering

## Token Streaming

### Problem
Without streaming, users see no feedback during model generation. The response appears all at once after waiting, which feels unresponsive and provides no indication that the system is working.

### Solution: Token-by-Token Streaming
Stream tokens from Ollama in real-time and display them as they arrive, providing immediate feedback and a responsive chat experience similar to ChatGPT.

### Requirements

1. **Real-time Display:** Tokens appear immediately as Ollama generates them
2. **Smooth Performance:** No lag or stuttering during high token throughput
3. **Tool Compatibility:** Streaming works correctly with tool calls and multi-turn conversations
4. **Auto-scroll:** Chat view follows streaming content automatically
5. **Error Handling:** Gracefully handle stream interruptions or errors
6. **State Management:** Maintain clean separation between streaming state and final message history

### Implementation Notes

#### Backend (Rust)
*   Enable streaming in Ollama requests: `stream: true`
*   Parse newline-delimited JSON from response body
*   Each line is a separate JSON object: `{"message":{"content":"token"},"done":false}`
*   Use `futures::StreamExt` or similar for async stream processing
*   Emit `chat:token` event for each token
*   Emit `chat:update` when streaming completes
*   Handle both streaming text and tool call interruptions

#### Frontend (TypeScript)
*   Create streaming state separate from message history
*   Listen for `chat:token` events and append to streaming buffer
*   Render streaming content in real-time
*   On `chat:update`, replace streaming content with final message
*   Maintain scroll position during streaming

#### Ollama Streaming Format
```json
{"message":{"role":"assistant","content":"Hello"},"done":false}
{"message":{"role":"assistant","content":" world"},"done":false}
{"message":{"role":"assistant","content":"!"},"done":true}
{"message":{"role":"assistant","tool_calls":[...]},"done":true}
```
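
Network chunks do not necessarily end on line boundaries, so a parser for this format has to buffer the incomplete tail between reads. The sketch below shows that buffering logic in TypeScript for consistency with the other examples; the Rust backend would apply the same idea to the `reqwest` byte stream. `NdjsonBuffer` and `OllamaChunk` are illustrative names, and malformed lines are left to throw from `JSON.parse`.

```typescript
// Subset of the streaming object shown above.
interface OllamaChunk {
  message?: { role?: string; content?: string; tool_calls?: unknown[] };
  done: boolean;
}

class NdjsonBuffer {
  private pending = "";

  // Feed one decoded text chunk; returns every JSON object it completes.
  push(chunk: string): OllamaChunk[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is "" or an incomplete line; keep it for the next push.
    this.pending = lines.pop() ?? "";
    const parsed: OllamaChunk[] = [];
    for (const line of lines) {
      if (line.trim() !== "") {
        parsed.push(JSON.parse(line) as OllamaChunk);
      }
    }
    return parsed;
  }
}
```

Each returned object with non-empty `message.content` would map to one `chat:token` event, and the object with `done: true` would trigger `chat:update`.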

### Edge Cases
*   Tool calls during streaming: Switch from text streaming to tool execution
*   Cancellation during streaming: Clean up streaming state properly
*   Network interruptions: Show error and preserve partial content
*   Very fast streaming: Throttle UI updates if needed for performance
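
For the "very fast streaming" case, one possible approach is to batch tokens and render at most once per interval. The sketch below takes the clock as a parameter so the behavior is easy to test; `TokenBatcher` and its parameters are hypothetical, not an existing part of the app.

```typescript
class TokenBatcher {
  private buffer = "";
  private lastFlush = Number.NEGATIVE_INFINITY;
  private readonly intervalMs: number;
  private readonly render: (text: string) => void;

  constructor(intervalMs: number, render: (text: string) => void) {
    this.intervalMs = intervalMs;
    this.render = render;
  }

  // Queue a token; render only if the interval has elapsed since the last flush.
  add(token: string, now: number = Date.now()): void {
    this.buffer += token;
    if (now - this.lastFlush >= this.intervalMs) {
      this.flush(now);
    }
  }

  // Force out whatever is buffered (call when the stream ends or is cancelled).
  flush(now: number = Date.now()): void {
    if (this.buffer === "") return;
    this.render(this.buffer);
    this.buffer = "";
    this.lastFlush = now;
  }
}
```

The first token renders immediately. Buffered tokens wait for the next `add` or an explicit `flush`, so the stream-complete and cancellation handlers need to call `flush` to avoid losing the tail.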

## Input Focus Management

### Problem
When the app loads with a project selected, users need to click into the chat input box before they can start typing. This adds unnecessary friction to the user experience.

### Solution: Auto-focus on Component Mount
The chat input field should automatically receive focus when the chat component mounts, allowing users to immediately start typing.

### Requirements

1. **Auto-focus:** Input field receives focus automatically when chat component loads
2. **Visible Cursor:** Cursor should be visible and blinking in the input field
3. **Immediate Typing:** User can start typing without clicking into the field
4. **Non-intrusive:** Should not interfere with other UI interactions or accessibility
5. **Timing:** Focus should be set after the component fully mounts

### Implementation Notes
*   Use React `useRef` to create a reference to the input element
*   Use `useEffect` with empty dependency array to run once on mount
*   Call `inputRef.current?.focus()` in the effect
*   Ensure the ref is properly attached to the input element
*   Example implementation:
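    ```tsx
    import { useEffect, useRef } from "react";

    // `ChatInput` is an illustrative component name.
    function ChatInput() {
      const inputRef = useRef<HTMLInputElement>(null);

      // Empty dependency array: runs once, after the component has mounted.
      useEffect(() => {
        inputRef.current?.focus();
      }, []);

      return <input ref={inputRef} /* ...other props */ />;
    }
    ```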

## Response Interruption

### Problem
Users may want to interrupt a long-running model response to ask a different question or change direction. Having to wait for the full response to complete creates friction and wastes time.

### Solution: Interrupt on Typing
When the user starts typing in the input field while the model is generating a response, the generation should be cancelled immediately, allowing the user to send a new message.

### Requirements

1. **Input Always Enabled:** The input field should remain enabled and usable even while the model is generating
2. **Interrupt Detection:** Detect when user types in the input field while `loading` state is true
3. **Immediate Cancellation:** Cancel the ongoing generation as soon as typing is detected
4. **Preserve Partial Response:** Any partial response generated before interruption should remain visible in the chat
5. **State Reset:** UI should return to normal state (ready to send) after interruption
6. **Preserve User Input:** The user's new input should be preserved in the input field
7. **Visual Feedback:** "Thinking..." indicator should disappear when generation is interrupted

### Implementation Notes
*   Do NOT disable the input field during loading
*   Listen for input changes while `loading` is true
*   When user types during loading, call backend to cancel generation (if possible) or just stop waiting
*   Set `loading` state to false immediately when typing detected
*   Backend may need a `cancel_chat` command or similar
*   Determine whether Ollama requests can be cancelled mid-generation, or whether we can only stop processing the response
*   Example implementation:
    ```tsx
    const handleInputChange = (e: React.ChangeEvent<HTMLInputElement>) => {
      const newValue = e.target.value;
      setInput(newValue);
      
      // If user starts typing while model is generating, interrupt
      if (loading && newValue.length > input.length) {
        setLoading(false);
        // Optionally call backend to cancel: invoke("cancel_chat")
      }
    };
    ```
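
If the backend cannot be cancelled and the frontend "just stops waiting", late `chat:token` events may still arrive after the interruption. One way to keep the partial response stable is to gate tokens behind an active flag, as in this sketch. `GenerationController` is a hypothetical helper, not an existing API.

```typescript
class GenerationController {
  private active = false;
  private partial = "";

  // Call when a new request is sent.
  start(): void {
    this.active = true;
    this.partial = "";
  }

  // Returns true if the token was accepted, false if it arrived after an interrupt.
  onToken(token: string): boolean {
    if (!this.active) return false;
    this.partial += token;
    return true;
  }

  // Stop accepting tokens; the partial response stays available for display.
  interrupt(): string {
    this.active = false;
    return this.partial;
  }
}
```

This does not distinguish tokens from an old generation that arrive after a new one has started. Handling that would need something like a generation id in the event payload, which the spec does not currently define.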

## Session Management

### Problem
Users may want to start a fresh conversation without restarting the application. Long conversations can become unwieldy, and users need a way to clear context for new tasks while keeping the same project open.

### Solution: New Session Button
Provide a clear, accessible way for users to start a new session by clearing the chat history.

### Requirements

1. **Button Placement:** Located in the header area, near model controls
2. **Visual Design:** Secondary/subtle styling to prevent accidental clicks
3. **Confirmation Dialog:** Ask "Are you sure? This will clear all messages." before clearing
4. **State Management:**
   - Clear `messages` state array
   - Clear `streamingContent` if any streaming is in progress
   - Preserve project path, model selection, and tool settings
   - Cancel any in-flight backend operations before clearing
5. **User Feedback:** Immediate visual response (messages disappear)
6. **Empty State:** Show a welcome message or empty state after clearing

### Implementation Notes

**Frontend:**
- Add "New Session" button to header
- Implement confirmation modal/dialog
- Call `setMessages([])` after confirmation
- Cancel any ongoing streaming/tool execution
- Consider keyboard shortcut (e.g., Cmd/Ctrl+K)

**Backend:**
- May need to cancel ongoing chat operations
- Clear any server-side state if applicable
- No persistent session history (sessions are ephemeral)

**Edge Cases:**
- Don't clear while actively streaming (cancel first, then clear)
- Handle confirmation dismissal (do nothing)
- Ensure button is always accessible (not disabled)
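
The state rules above (confirm, cancel first, clear, preserve settings) can be sketched as one pure function. `SessionState` and the injected `confirm` and `cancelInFlight` callbacks are assumptions for illustration; in the app they would wrap the confirmation dialog and whatever cancel mechanism the backend exposes.

```typescript
// Hypothetical slice of frontend state touched by "New Session".
interface SessionState {
  messages: { role: string; content: string }[];
  streamingContent: string;
  projectPath: string;
  model: string;
  enabledTools: string[];
}

function startNewSession(
  state: SessionState,
  confirm: () => boolean,
  cancelInFlight: () => void,
): SessionState {
  // Dialog dismissed: do nothing.
  if (!confirm()) return state;
  // Cancel first, then clear.
  cancelInFlight();
  // Project path, model, and tool settings carry over unchanged.
  return { ...state, messages: [], streamingContent: "" };
}
```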

### Button Label Options
- "New Session" (clear and descriptive)
- "Clear Chat" (direct but less friendly)
- "Start Over" (conversational)
- Icon: 🔄 or ⊕ (plus in circle)

## Context Window Usage Display

### Problem
Users have no visibility into how much of the model's context window they're using. This leads to:
- Unexpected quality degradation when context limit is reached
- Uncertainty about when to start a new session
- Inability to gauge conversation length

### Solution: Real-time Context Usage Indicator
Display a persistent indicator showing current token usage vs. model's context window limit.

### Requirements

1. **Visual Indicator:** Always visible in header area
2. **Real-time Updates:** Updates as messages are added
3. **Model-Aware:** Shows correct limit based on selected model
4. **Color Coding:** Visual warning as limit approaches
   - Green/default: 0-74% usage
   - Yellow/warning: 75-89% usage
   - Red/danger: 90-100% usage
5. **Clear Format:** "2.5K / 8K tokens (31%)" or similar
6. **Token Estimation:** Approximate token count for all messages

### Implementation Notes

**Token Estimation:**
- Use simple approximation: 1 token ≈ 4 characters
- Or integrate `gpt-tokenizer` for a closer estimate (it implements OpenAI's tokenizers, so counts for Ollama models are still approximate)
- Count: system prompts + user messages + assistant responses + tool outputs + tool calls

**Model Context Windows:**
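- llama3.1, llama3.2: 8K tokens
- qwen2.5-coder: 32K tokens
- deepseek-coder: 16K tokens
- Default/unknown: 8K tokens
- These are the limits the indicator assumes; the effective window for a request also depends on the `num_ctx` option Ollama runs the model with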

**Calculation:**
```tsx
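// Rough heuristic from above: about 4 characters per token.
const estimateTokens = (text: string): number => {
  return Math.ceil(text.length / 4);
};

// Sum the estimate over the system prompt, every message body, and any tool calls.
const calculateContextUsage = (messages: Message[], systemPrompt: string) => {
  let total = estimateTokens(systemPrompt);
  messages.forEach(msg => {
    // Messages that only carry tool calls may have no text content.
    total += estimateTokens(msg.content ?? "");
    if (msg.tool_calls) {
      // Serialized tool calls occupy context too.
      total += estimateTokens(JSON.stringify(msg.tool_calls));
    }
  });
  return total;
};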
```
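
The remaining pieces (model lookup, color thresholds, and the display format) might look like the sketch below. It assumes "K" means 1,000 and writes the limits as round thousands; `contextWindowFor`, `formatK`, and `describeUsage` are illustrative names.

```typescript
// Limits from the list above, written as round thousands (an assumption).
const CONTEXT_WINDOWS: Record<string, number> = {
  "llama3.1": 8000,
  "llama3.2": 8000,
  "qwen2.5-coder": 32000,
  "deepseek-coder": 16000,
};
const DEFAULT_CONTEXT_WINDOW = 8000;

function contextWindowFor(model: string): number {
  // Ollama model names may carry a tag, e.g. "llama3.1:8b"; match on the base name.
  const base = model.split(":")[0] ?? model;
  return Object.prototype.hasOwnProperty.call(CONTEXT_WINDOWS, base)
    ? CONTEXT_WINDOWS[base]
    : DEFAULT_CONTEXT_WINDOW;
}

type UsageLevel = "default" | "warning" | "danger";

// 2500 -> "2.5K", 8000 -> "8K", 0 -> "0"
function formatK(n: number): string {
  if (n < 1000) return String(n);
  return `${(n / 1000).toFixed(1).replace(/\.0$/, "")}K`;
}

function describeUsage(used: number, limit: number): { label: string; level: UsageLevel } {
  const percent = Math.floor((used * 100) / limit);
  // Thresholds from the requirements: 75% warns, 90% is danger.
  const level: UsageLevel =
    percent >= 90 ? "danger" : percent >= 75 ? "warning" : "default";
  return {
    label: `${formatK(used)} / ${formatK(limit)} tokens (${percent}%)`,
    level,
  };
}
```

For example, `describeUsage(2500, contextWindowFor("llama3.1:8b"))` yields the label `2.5K / 8K tokens (31%)` at the default level.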

**UI Placement:**
- Header area, near model selector
- Non-intrusive but always visible
- Optional tooltip with breakdown on hover

### Edge Cases
- Empty conversation: Show "0 / 8K"
- During streaming: Include partial content
- After clearing: Reset to 0
- Model change: Update context window limit