storkit/.living_spec/stories/18_streaming_responses_testing.md
Commit 64d1b788be (Dave): Story 18: Token-by-token streaming responses
- Backend: Added OllamaProvider::chat_stream() with newline-delimited JSON parsing
- Backend: Emit chat:token events for each token received from Ollama
- Backend: Added futures dependency and stream feature for reqwest
- Frontend: Added streamingContent state and chat:token event listener
- Frontend: Real-time token display with auto-scroll
- Frontend: Markdown and syntax highlighting support for streaming content
- Fixed all TypeScript errors (tsc --noEmit)
- Fixed all Biome warnings and errors
- Fixed all Clippy warnings
- Added comprehensive code quality documentation
- Added tsc --noEmit to verification checklist

Tested and verified:
- Tokens stream in real-time
- Auto-scroll works during streaming
- Tool calls interrupt streaming correctly
- Multi-turn conversations work
- Smooth performance with no lag
2025-12-27 16:50:18 +00:00


Story 18: Streaming Responses - Testing Notes

Manual Testing Checklist

Setup

  1. Start Ollama: ollama serve
  2. Ensure a model is installed: ollama list
  3. Build and run the app: npm run tauri dev

Test Cases

TC1: Basic Streaming

  • Send a simple message: "Hello, how are you?"
  • Verify tokens appear one-by-one in real-time
  • Verify smooth streaming with no lag
  • Verify message appears in the chat history after streaming completes

TC2: Long Response Streaming

  • Send: "Write a long explanation of how React hooks work"
  • Verify streaming continues smoothly for long responses
  • Verify auto-scroll keeps the latest token visible
  • Verify no UI stuttering or performance issues
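A common way to make auto-scroll well-behaved is to follow the stream only while the user is already near the bottom, so scrolling up to re-read is not fought by incoming tokens. A minimal sketch of that check (the function name, parameters, and threshold are illustrative assumptions, not storkit's actual implementation):

```typescript
// Hypothetical helper: auto-scroll only when the chat container is
// already scrolled to (or near) the bottom.
function shouldAutoScroll(
  scrollTop: number,    // current scroll offset of the chat container
  clientHeight: number, // visible height of the container
  scrollHeight: number, // total height of the scrollable content
  threshold = 40,       // px of slack still counted as "at the bottom"
): boolean {
  const distanceFromBottom = scrollHeight - (scrollTop + clientHeight);
  return distanceFromBottom <= threshold;
}

// Pinned to the bottom: keep following the stream.
console.log(shouldAutoScroll(960, 240, 1200)); // true (distance = 0)
// Scrolled up to re-read: leave the viewport alone.
console.log(shouldAutoScroll(100, 240, 1200)); // false (distance = 860)
```

In a React component this would typically run inside the `chat:token` handler before calling `scrollIntoView` on the last message.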

TC3: Code Block Streaming

  • Send: "Show me a simple Python function"
  • Verify code blocks stream correctly
  • Verify syntax highlighting appears after streaming completes
  • Verify code formatting is preserved

TC4: Tool Calls During Streaming

  • Send: "Read the package.json file"
  • Verify streaming stops when a tool call is detected
  • Verify tool execution begins immediately
  • Verify tool output appears in chat
  • Verify conversation can continue after tool execution

TC5: Multiple Turns

  • Have a 3-4 turn conversation
  • Verify each response streams correctly
  • Verify message history is maintained
  • Verify context is preserved across turns

TC6: Stop Button During Streaming

  • Send a request for a long response
  • Click the Stop button mid-stream
  • Verify streaming stops immediately
  • Verify partial response is preserved in chat
  • Verify new messages can be sent after stopping

TC7: Network Interruption

  • Send a request
  • Stop Ollama during streaming (simulate network error)
  • Verify graceful error handling
  • Verify partial content is preserved
  • Verify error message is shown

TC8: Fast Streaming

  • Use a fast model (e.g., llama3.1:8b)
  • Send: "Count from 1 to 20"
  • Verify the UI keeps up with the fast token rate
  • Verify no dropped tokens

Expected Behavior

Streaming Flow

  1. User sends message
  2. Message appears in chat immediately
  3. "Thinking..." indicator appears briefly
  4. Tokens start appearing in real-time in assistant message bubble
  5. Auto-scroll keeps latest token visible
  6. When streaming completes, chat:update event finalizes the message
  7. Message is added to history
  8. UI returns to ready state
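The flow above can be sketched as a small pure reducer over the two events. This is a hedged illustration: the type names and the `Message` shape are assumptions, not storkit's actual state model.

```typescript
// Sketch of the streaming flow as a pure reducer (hypothetical names).
type Message = { role: "user" | "assistant"; content: string };

type ChatState = {
  status: "idle" | "streaming";
  streamingContent: string; // partial assistant reply being built (step 4)
  history: Message[];
};

type ChatEvent =
  | { type: "chat:token"; payload: string }      // one token (step 4)
  | { type: "chat:update"; payload: Message[] }; // finalized history (steps 6-7)

function applyChatEvent(state: ChatState, event: ChatEvent): ChatState {
  switch (event.type) {
    case "chat:token":
      // Append the token so it renders immediately.
      return {
        ...state,
        status: "streaming",
        streamingContent: state.streamingContent + event.payload,
      };
    case "chat:update":
      // Streaming done: adopt the finalized history and clear the
      // partial buffer, returning the UI to ready (step 8).
      return { status: "idle", streamingContent: "", history: event.payload };
  }
}
```

Keeping this logic pure makes the streaming flow unit-testable without a running Tauri backend.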

Events

  • chat:token: Emitted for each token (payload: string)
  • chat:update: Emitted when streaming completes (payload: Message[])
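When debugging listeners for these two events, small runtime guards help confirm the payloads have the expected shapes. A hedged sketch (the `Message` shape is an assumption based on the payloads listed above):

```typescript
// Runtime guards for the two event payloads (illustrative only).
type Message = { role: string; content: string };

// chat:token carries a plain string token.
function isTokenPayload(p: unknown): p is string {
  return typeof p === "string";
}

// chat:update carries the finalized Message[].
function isUpdatePayload(p: unknown): p is Message[] {
  return (
    Array.isArray(p) &&
    p.every(
      (m) =>
        typeof m === "object" &&
        m !== null &&
        typeof (m as Message).role === "string" &&
        typeof (m as Message).content === "string",
    )
  );
}
```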

UI States

  • Idle: Input enabled, no loading indicator
  • Streaming: Input disabled, streaming content visible, auto-scrolling
  • Tool Execution: Input disabled, tool output visible
  • Error: Error message visible, input re-enabled
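The four UI states map naturally onto a discriminated union, which makes the "input enabled" rule explicit. A sketch under assumed names (storkit's real state may be modeled differently):

```typescript
// The four UI states as a discriminated union (hypothetical names).
type UiState =
  | { kind: "idle" }
  | { kind: "streaming"; content: string }
  | { kind: "toolExecution"; toolName: string }
  | { kind: "error"; message: string };

// Per the list above, input is enabled only when idle or after an error.
function isInputEnabled(state: UiState): boolean {
  return state.kind === "idle" || state.kind === "error";
}
```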

Debugging

Backend Logs

Check terminal for Rust logs:

  • Look for "=== Ollama Request ===" to verify streaming is enabled
  • Check for streaming response parsing logs

Frontend Console

Open DevTools console:

  • Look for chat:token events
  • Look for chat:update events
  • Check for any JavaScript errors

Ollama Logs

Check Ollama logs:

journalctl -u ollama -f  # Linux
tail -f /var/log/ollama.log  # If configured

Known Issues / Limitations

  1. Streaming is Ollama-only: Other providers (Claude, GPT) are not yet supported
  2. Tool outputs don't stream: Tools execute and return results all at once
  3. No streaming animations: Tokens are simply appended as text; there are no typing effects
  4. Token buffering: Very fast streaming might batch tokens slightly
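The buffering behavior in item 4 can be illustrated with a tiny batcher that coalesces tokens into chunks before handing them to the UI, trading a little latency for fewer re-renders. The class and its parameters are illustrative assumptions, not storkit code:

```typescript
// Hypothetical batcher: tokens are coalesced and delivered to the UI
// in chunks rather than one by one.
class TokenBatcher {
  private buffer = "";

  constructor(
    private flushSize: number,                // flush once this many chars accumulate
    private onFlush: (chunk: string) => void, // e.g. a setState-style append
  ) {}

  push(token: string): void {
    this.buffer += token;
    if (this.buffer.length >= this.flushSize) this.flush();
  }

  // Call on stream end so trailing tokens are not lost.
  flush(): void {
    if (this.buffer.length === 0) return;
    this.onFlush(this.buffer);
    this.buffer = "";
  }
}

const chunks: string[] = [];
const batcher = new TokenBatcher(4, (c) => chunks.push(c));
["a", "b", "c", "d", "e"].forEach((t) => batcher.push(t));
batcher.flush();
console.log(chunks); // ["abcd", "e"]
```

A production version would more likely flush on `requestAnimationFrame` or a short timer instead of a character count, but the character-count variant keeps the sketch deterministic.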

Success Criteria

All acceptance criteria from Story 18 must pass:

  • Backend emits chat:token events
  • Frontend listens and displays tokens in real-time
  • Tokens appear smoothly without lag (manual verification required)
  • Auto-scroll works during streaming (manual verification required)
  • Tool calls work correctly with streaming (manual verification required)
  • Stop button cancels streaming (manual verification required)
  • Error handling works (manual verification required)
  • Multi-turn conversations work (manual verification required)