Benefits of Streaming

  • Better UX: Users see responses immediately
  • Perceived Performance: Feels faster even if total time is the same
  • Progressive Display: Long responses appear naturally
  • Lower Memory: Process chunks as they arrive instead of waiting for the full response

Implementation Guide

See the Streaming Examples for complete code samples.

Best Practices

Streams can fail mid-response. Always wrap streaming code in error handling (try/except in Python, try/catch in JavaScript).
try:
    for chunk in stream:
        # Process chunk
        pass
except Exception as e:
    print(f"Stream error: {e}")
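Because a stream can fail partway through, it is often useful to keep whatever text arrived before the error. The following is a minimal sketch of that idea, not a definitive implementation. The helpers `make_chunk`, `consume_stream`, and `flaky_stream` are hypothetical names introduced for illustration, and the stand-in chunks only mimic the `choices[0].delta.content` shape used in the snippets on this page.

```python
from types import SimpleNamespace


def make_chunk(content=None, finish_reason=None):
    """Build a stand-in chunk with the choices[0].delta.content shape (hypothetical helper)."""
    delta = SimpleNamespace(content=content)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


def consume_stream(stream):
    """Collect text from a stream and return (text_so_far, error_or_None)."""
    parts = []
    try:
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                parts.append(content)
    except Exception as e:  # broad on purpose; narrow this to your client's error types
        return "".join(parts), e
    return "".join(parts), None


def flaky_stream():
    """Hypothetical stream that drops the connection after two chunks."""
    yield make_chunk("Hello, ")
    yield make_chunk("wor")
    raise ConnectionError("connection dropped")


text, error = consume_stream(flaky_stream())
if error is not None:
    print(f"Stream error after {len(text)} characters: {error}")
```

Returning the partial text alongside the error lets the caller decide whether to show it, retry, or discard it.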
Don’t update the UI for every single chunk; buffer updates and apply them in batches for performance.
let buffer = '';
const BUFFER_SIZE = 5;

for await (const chunk of stream) {
  buffer += chunk.choices[0]?.delta?.content || '';
  if (buffer.length >= BUFFER_SIZE) {
    updateUI(buffer);
    buffer = '';
  }
}
// Flush remaining buffer
if (buffer) updateUI(buffer);
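The same buffering pattern can be sketched in Python as a generator that batches small text pieces and flushes the remainder at the end. This is an illustrative sketch only: `buffer_text` and its `min_chars` parameter are hypothetical names, and the input is a plain list of strings standing in for the text extracted from each chunk.

```python
def buffer_text(pieces, min_chars=5):
    """Yield batches of at least min_chars characters, then flush what is left."""
    buffer = ""
    for piece in pieces:
        buffer += piece
        if len(buffer) >= min_chars:
            yield buffer
            buffer = ""
    # Flush the remaining text so the tail of the response is not lost
    if buffer:
        yield buffer


# Stand-in for the text of six streamed chunks
pieces = ["He", "llo", ", ", "wor", "ld", "!"]
batches = list(buffer_text(pieces))
for batch in batches:
    print(batch, end="", flush=True)
```

A character threshold is the simplest trigger; a time-based flush (for example, every few tens of milliseconds) is a common alternative when chunk sizes vary widely.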
Check finish_reason to know when the stream is done so you can update UI state.
for chunk in stream:
    # finish_reason stays empty until the final chunk
    if chunk.choices[0].finish_reason:
        # Stream is done
        break
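Completion detection can also record why the stream ended, which helps distinguish a normal finish from a stream that stopped without ever reporting one. The sketch below assumes an OpenAI-style response in which `finish_reason` is a string such as "stop" on the final chunk. That value, and the helper names `make_chunk` and `read_until_finished`, are assumptions for illustration and are not confirmed by this page.

```python
from types import SimpleNamespace


def make_chunk(content=None, finish_reason=None):
    """Build a stand-in chunk with the shape used on this page (hypothetical helper)."""
    delta = SimpleNamespace(content=content)
    choice = SimpleNamespace(delta=delta, finish_reason=finish_reason)
    return SimpleNamespace(choices=[choice])


def read_until_finished(stream):
    """Return (text, finish_reason); finish_reason is None if the stream ended early."""
    parts = []
    finish_reason = None
    for chunk in stream:
        if not chunk.choices:  # defensive: skip chunks that carry no choices
            continue
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        if choice.finish_reason:
            finish_reason = choice.finish_reason
            break
    return "".join(parts), finish_reason


complete = [make_chunk("Hi"), make_chunk(" there"), make_chunk(None, "stop")]
text, reason = read_until_finished(iter(complete))

# A stream that ends without a finish_reason, e.g. after a dropped connection
cut_text, cut_reason = read_until_finished(iter([make_chunk("Hi")]))
```

Treating a missing `finish_reason` as an incomplete response gives the UI a clear signal to show a retry option instead of a finished state.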
Keep track of the complete response for later use.
full_response = ""
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:  # skip chunks that carry no text
        full_response += content
        print(content, end="", flush=True)  # flush so text appears immediately
# full_response now contains the complete text

Full Examples

See complete streaming implementations