Story 17: Display Context Window Usage with emoji indicator

- Added real-time context window usage indicator in header
- Format: emoji + percentage (🟢 52%)
- Color-coded emoji: 🟢 <75%, 🟡 <90%, 🔴 >=90%
- Hover tooltip shows full details: 'Context: 4,300 / 8,192 tokens (52%)'
- Token estimation: 1 token ≈ 4 characters
- Model-aware context windows: llama3 (8K), qwen2.5 (32K), deepseek (16K)
- Includes system prompts, messages, tool calls, and streaming content
- Updates in real-time as conversation progresses
- All quality checks passing (TypeScript, Biome, Clippy, builds)

Tested and verified:
- Shows accurate percentage of context usage
- Emoji changes color at appropriate thresholds
- Different models show correct context window sizes
- Can exceed 100% when over limit (shows red)
- Tooltip provides exact token counts
Dave
2025-12-27 17:26:21 +00:00
parent 9965c78221
commit bd8d838457
4 changed files with 811 additions and 597 deletions


@@ -338,3 +338,69 @@ Provide a clear, accessible way for users to start a new session by clearing the
- "Clear Chat" (direct but less friendly)
- "Start Over" (conversational)
- Icon: 🔄 or ⊕ (plus in circle)
## Context Window Usage Display
### Problem
Users have no visibility into how much of the model's context window they're using. This leads to:
- Unexpected quality degradation when context limit is reached
- Uncertainty about when to start a new session
- Inability to gauge conversation length
### Solution: Real-time Context Usage Indicator
Display a persistent indicator showing current token usage vs. model's context window limit.
### Requirements
1. **Visual Indicator:** Always visible in header area
2. **Real-time Updates:** Updates as messages are added
3. **Model-Aware:** Shows correct limit based on selected model
4. **Color Coding:** Visual warning as limit approaches
   - Green/default: 0-74% usage
   - Yellow/warning: 75-89% usage
   - Red/danger: 90% usage and above (the estimate can exceed 100% when over the limit)
5. **Clear Format:** "2.5K / 8K tokens (31%)" or similar
6. **Token Estimation:** Approximate token count for all messages
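
The color thresholds and display format in requirements 4 and 5 could be sketched roughly as below. The names `usageLevel`, `formatTokens`, and `formatUsage` are hypothetical, and two details are assumptions rather than part of the spec: "K" is treated as 1,000 tokens for display, and any value at or above 90% (including values over 100%) maps to red.

```typescript
type UsageLevel = "green" | "yellow" | "red";

// Map a usage percentage to a warning level using the thresholds above.
// Assumption: values over 100% stay red.
const usageLevel = (percent: number): UsageLevel => {
  if (percent >= 90) return "red";
  if (percent >= 75) return "yellow";
  return "green";
};

// Abbreviate a token count for display, e.g. 2500 -> "2.5K", 8000 -> "8K".
// Assumption: "K" means 1,000 here, not 1,024.
const formatTokens = (n: number): string => {
  if (n < 1000) return String(n);
  const k = (n / 1000).toFixed(1).replace(/\.0$/, "");
  return `${k}K`;
};

// Build the label from requirement 5, e.g. "2.5K / 8K tokens (31%)".
const formatUsage = (used: number, limit: number): string => {
  const percent = limit > 0 ? Math.round((used / limit) * 100) : 0;
  return `${formatTokens(used)} / ${formatTokens(limit)} tokens (${percent}%)`;
};
```

Keeping the level and the label as separate pure functions means the header component only has to pick a color or emoji from `usageLevel` and render the string from `formatUsage`.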
### Implementation Notes
**Token Estimation:**
- Use simple approximation: 1 token ≈ 4 characters
- Or integrate `gpt-tokenizer` for more accuracy (still an approximation for non-OpenAI models, which use their own tokenizers)
- Count: system prompts + user messages + assistant responses + tool outputs + tool calls
**Model Context Windows:**
- llama3.1, llama3.2: 8K tokens
- qwen2.5-coder: 32K tokens
- deepseek-coder: 16K tokens
- Default/unknown: 8K tokens
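
One way to express this table in code is a prefix lookup with a default, sketched below. The names `CONTEXT_WINDOWS`, `DEFAULT_CONTEXT_WINDOW`, and `getContextWindow` are hypothetical. Reading "8K" as 8,192 tokens follows the tooltip example in the commit message; reading "16K" and "32K" as 16,384 and 32,768 is an assumption, as is matching by prefix so that names with a tag suffix still resolve.

```typescript
// Fallback for unknown models (8K, per the list above).
const DEFAULT_CONTEXT_WINDOW = 8192;

// Model-name prefixes and their context window sizes in tokens.
// Assumption: "16K" = 16384 and "32K" = 32768.
const CONTEXT_WINDOWS: ReadonlyArray<[string, number]> = [
  ["llama3.1", 8192],
  ["llama3.2", 8192],
  ["qwen2.5-coder", 32768],
  ["deepseek-coder", 16384],
];

// Return the context window for a model name, matching case-insensitively
// by prefix so a tagged name (e.g. "qwen2.5-coder:7b") resolves as well.
const getContextWindow = (model: string): number => {
  const name = model.toLowerCase();
  const match = CONTEXT_WINDOWS.find(([prefix]) => name.startsWith(prefix));
  return match ? match[1] : DEFAULT_CONTEXT_WINDOW;
};
```

A plain array keeps the lookup order explicit; if more specific prefixes are added later, they would need to be listed before shorter ones that overlap them.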
**Calculation:**
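```tsx
// Rough estimate: 1 token ≈ 4 characters.
const estimateTokens = (text: string): number => {
  return Math.ceil(text.length / 4);
};

// Sum estimated tokens for the system prompt and every message.
// Assumes `Message` has `content: string` and optional `tool_calls`.
const calculateContextUsage = (messages: Message[], systemPrompt: string): number => {
  let total = estimateTokens(systemPrompt);
  messages.forEach(msg => {
    total += estimateTokens(msg.content);
    if (msg.tool_calls) {
      // Tool calls are serialized to JSON before counting.
      total += estimateTokens(JSON.stringify(msg.tool_calls));
    }
  });
  return total;
};
```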
**UI Placement:**
- Header area, near model selector
- Non-intrusive but always visible
- Optional tooltip with breakdown on hover
### Edge Cases
- Empty conversation: Show "0 / 8K"
- During streaming: Include partial content
- After clearing: Reset to 0
- Model change: Update context window limit
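
A minimal sketch of how these edge cases might fall out of a single pure function is shown below. The names `ChatMessage`, `UsageInput`, and `contextUsage` are hypothetical, and the shape of the state (a `streamingContent` string that is empty when idle) is an assumption about the app, not something the spec defines.

```typescript
// Hypothetical minimal message shape for this sketch.
interface ChatMessage {
  content: string;
}

interface UsageInput {
  systemPrompt: string;
  messages: ChatMessage[];
  streamingContent: string; // partial assistant reply; "" when not streaming
  contextWindow: number;    // limit for the currently selected model
}

// Same approximation as the spec: 1 token ≈ 4 characters.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Recompute usage from scratch on every state change, so clearing the chat
// or switching models needs no special handling.
const contextUsage = (input: UsageInput): { used: number; percent: number } => {
  const fromMessages = input.messages.reduce(
    (sum, msg) => sum + estimateTokens(msg.content),
    0,
  );
  const used =
    estimateTokens(input.systemPrompt) +
    fromMessages +
    estimateTokens(input.streamingContent); // include partial content
  const percent =
    input.contextWindow > 0 ? Math.round((used / input.contextWindow) * 100) : 0;
  return { used, percent };
};
```

Because the result is derived entirely from the current inputs, an empty conversation with an empty system prompt yields 0, a model change only swaps `contextWindow`, and the percentage is free to exceed 100 when the estimate is over the limit.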