storkit/.living_spec/stories/18_streaming_responses_testing.md
Commit 64d1b788be (Dave): Story 18: Token-by-token streaming responses
- Backend: Added OllamaProvider::chat_stream() with newline-delimited JSON parsing
- Backend: Emit chat:token events for each token received from Ollama
- Backend: Added futures dependency and stream feature for reqwest
- Frontend: Added streamingContent state and chat:token event listener
- Frontend: Real-time token display with auto-scroll
- Frontend: Markdown and syntax highlighting support for streaming content
- Fixed all TypeScript errors (tsc --noEmit)
- Fixed all Biome warnings and errors
- Fixed all Clippy warnings
- Added comprehensive code quality documentation
- Added tsc --noEmit to verification checklist

Tested and verified:
- Tokens stream in real-time
- Auto-scroll works during streaming
- Tool calls interrupt streaming correctly
- Multi-turn conversations work
- Smooth performance with no lag
2025-12-27 16:50:18 +00:00


Story 18: Streaming Responses - Testing Notes

Manual Testing Checklist

Setup

  1. Start Ollama: ollama serve
  2. Ensure a model is installed: ollama list
  3. Build and run the app: npm run tauri dev

Test Cases

TC1: Basic Streaming

  • Send a simple message: "Hello, how are you?"
  • Verify tokens appear one-by-one in real-time
  • Verify smooth streaming with no lag
  • Verify message appears in the chat history after streaming completes

TC2: Long Response Streaming

  • Send: "Write a long explanation of how React hooks work"
  • Verify streaming continues smoothly for long responses
  • Verify auto-scroll keeps the latest token visible
  • Verify no UI stuttering or performance issues
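A common way to make auto-scroll well-behaved is to follow the stream only while the user is already near the bottom, so scrolling up to re-read is not fought by incoming tokens. A minimal sketch of that check (the function name, parameters, and threshold are illustrative assumptions, not storkit's actual implementation):

```typescript
// Hypothetical helper: auto-scroll only when the chat container is
// already scrolled to (or near) the bottom.
function shouldAutoScroll(
  scrollTop: number,    // current scroll offset of the chat container
  clientHeight: number, // visible height of the container
  scrollHeight: number, // total height of the scrollable content
  threshold = 40,       // px of slack still counted as "at the bottom"
): boolean {
  const distanceFromBottom = scrollHeight - (scrollTop + clientHeight);
  return distanceFromBottom <= threshold;
}

// Pinned to the bottom: keep following the stream.
console.log(shouldAutoScroll(960, 240, 1200)); // true (distance = 0)
// Scrolled up to re-read: leave the viewport alone.
console.log(shouldAutoScroll(100, 240, 1200)); // false (distance = 860)
```

In a React component this would typically run inside the `chat:token` handler before calling `scrollIntoView` on the last message.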

TC3: Code Block Streaming

  • Send: "Show me a simple Python function"
  • Verify code blocks stream correctly
  • Verify syntax highlighting appears after streaming completes
  • Verify code formatting is preserved

TC4: Tool Calls During Streaming

  • Send: "Read the package.json file"
  • Verify streaming stops when a tool call is detected
  • Verify tool execution begins immediately
  • Verify tool output appears in chat
  • Verify conversation can continue after tool execution

TC5: Multiple Turns

  • Have a 3-4 turn conversation
  • Verify each response streams correctly
  • Verify message history is maintained
  • Verify context is preserved across turns

TC6: Stop Button During Streaming

  • Send a request for a long response
  • Click the Stop button mid-stream
  • Verify streaming stops immediately
  • Verify partial response is preserved in chat
  • Verify new messages can be sent after stopping

TC7: Network Interruption

  • Send a request
  • Stop Ollama during streaming (simulate network error)
  • Verify graceful error handling
  • Verify partial content is preserved
  • Verify error message is shown

TC8: Fast Streaming

  • Use a fast model (e.g., llama3.1:8b)
  • Send: "Count from 1 to 20"
  • Verify the UI keeps up with the fast token rate
  • Verify no dropped tokens

Expected Behavior

Streaming Flow

  1. User sends message
  2. Message appears in chat immediately
  3. "Thinking..." indicator appears briefly
  4. Tokens start appearing in real-time in assistant message bubble
  5. Auto-scroll keeps latest token visible
  6. When streaming completes, chat:update event finalizes the message
  7. Message is added to history
  8. UI returns to ready state
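The flow above can be sketched as a small pure reducer over the two events. This is a hedged illustration: the type names and the `Message` shape are assumptions, not storkit's actual state model.

```typescript
// Sketch of the streaming flow as a pure reducer (hypothetical names).
type Message = { role: "user" | "assistant"; content: string };

type ChatState = {
  status: "idle" | "streaming";
  streamingContent: string; // partial assistant reply being built (step 4)
  history: Message[];
};

type ChatEvent =
  | { type: "chat:token"; payload: string }      // one token (step 4)
  | { type: "chat:update"; payload: Message[] }; // finalized history (steps 6-7)

function applyChatEvent(state: ChatState, event: ChatEvent): ChatState {
  switch (event.type) {
    case "chat:token":
      // Append the token so it renders immediately.
      return {
        ...state,
        status: "streaming",
        streamingContent: state.streamingContent + event.payload,
      };
    case "chat:update":
      // Streaming done: adopt the finalized history and clear the
      // partial buffer, returning the UI to ready (step 8).
      return { status: "idle", streamingContent: "", history: event.payload };
  }
}
```

Keeping this logic pure makes the streaming flow unit-testable without a running Tauri backend.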

Events

  • chat:token: Emitted for each token (payload: string)
  • chat:update: Emitted when streaming completes (payload: Message[])
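When debugging listeners for these two events, small runtime guards help confirm the payloads have the expected shapes. A hedged sketch (the `Message` shape is an assumption based on the payloads listed above):

```typescript
// Runtime guards for the two event payloads (illustrative only).
type Message = { role: string; content: string };

// chat:token carries a plain string token.
function isTokenPayload(p: unknown): p is string {
  return typeof p === "string";
}

// chat:update carries the finalized Message[].
function isUpdatePayload(p: unknown): p is Message[] {
  return (
    Array.isArray(p) &&
    p.every(
      (m) =>
        typeof m === "object" &&
        m !== null &&
        typeof (m as Message).role === "string" &&
        typeof (m as Message).content === "string",
    )
  );
}
```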

UI States

  • Idle: Input enabled, no loading indicator
  • Streaming: Input disabled, streaming content visible, auto-scrolling
  • Tool Execution: Input disabled, tool output visible
  • Error: Error message visible, input re-enabled
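The four UI states map naturally onto a discriminated union, which makes the "input enabled" rule explicit. A sketch under assumed names (storkit's real state may be modeled differently):

```typescript
// The four UI states as a discriminated union (hypothetical names).
type UiState =
  | { kind: "idle" }
  | { kind: "streaming"; content: string }
  | { kind: "toolExecution"; toolName: string }
  | { kind: "error"; message: string };

// Per the list above, input is enabled only when idle or after an error.
function isInputEnabled(state: UiState): boolean {
  return state.kind === "idle" || state.kind === "error";
}
```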

Debugging

Backend Logs

Check terminal for Rust logs:

  • Look for "=== Ollama Request ===" to verify streaming is enabled
  • Check for streaming response parsing logs

Frontend Console

Open DevTools console:

  • Look for chat:token events
  • Look for chat:update events
  • Check for any JavaScript errors

Ollama Logs

Check Ollama logs:

journalctl -u ollama -f  # Linux
tail -f /var/log/ollama.log  # If configured

Known Issues / Limitations

  1. Streaming is Ollama-only: Other providers (Claude, GPT) are not yet supported
  2. Tool outputs don't stream: Tools execute and return results all at once
  3. No streaming animations: Tokens are simply appended as text; there are no typing effects
  4. Token buffering: Very fast streaming might batch tokens slightly
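The buffering behavior in item 4 can be illustrated with a tiny batcher that coalesces tokens into chunks before handing them to the UI, trading a little latency for fewer re-renders. The class and its parameters are illustrative assumptions, not storkit code:

```typescript
// Hypothetical batcher: tokens are coalesced and delivered to the UI
// in chunks rather than one by one.
class TokenBatcher {
  private buffer = "";

  constructor(
    private flushSize: number,                // flush once this many chars accumulate
    private onFlush: (chunk: string) => void, // e.g. a setState-style append
  ) {}

  push(token: string): void {
    this.buffer += token;
    if (this.buffer.length >= this.flushSize) this.flush();
  }

  // Call on stream end so trailing tokens are not lost.
  flush(): void {
    if (this.buffer.length === 0) return;
    this.onFlush(this.buffer);
    this.buffer = "";
  }
}

const chunks: string[] = [];
const batcher = new TokenBatcher(4, (c) => chunks.push(c));
["a", "b", "c", "d", "e"].forEach((t) => batcher.push(t));
batcher.flush();
console.log(chunks); // ["abcd", "e"]
```

A production version would more likely flush on `requestAnimationFrame` or a short timer instead of a character count, but the character-count variant keeps the sketch deterministic.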

Success Criteria

All acceptance criteria from Story 18 must pass:

  • Backend emits chat:token events
  • Frontend listens and displays tokens in real-time
  • Tokens appear smoothly without lag (manual verification required)
  • Auto-scroll works during streaming (manual verification required)
  • Tool calls work correctly with streaming (manual verification required)
  • Stop button cancels streaming (manual verification required)
  • Error handling works (manual verification required)
  • Multi-turn conversations work (manual verification required)