Streaming
FlowAPI supports Server-Sent Events (SSE) for chat completion streaming. Streaming is recommended for long outputs because it reduces perceived latency and helps avoid timeout issues on slow or verbose generations.
Enable Streaming
Set stream to true in the request body:
{
  "model": "deepseek-ai/DeepSeek-V3.2",
  "messages": [
    { "role": "user", "content": "Write a long tutorial about SSE." }
  ],
  "stream": true
}
HTTP Example
curl https://api.flowapi.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_FLOW_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3.2",
    "messages": [
      { "role": "user", "content": "Write a haiku about distributed systems." }
    ],
    "stream": true
  }'
Python Example
from openai import OpenAI
# Point the OpenAI SDK at the FlowAPI endpoint.
client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1",
)
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.2",
    messages=[{"role": "user", "content": "Write a haiku about distributed systems."}],
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices (for example, a usage-only chunk).
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    # delta.content can be None or empty on role or tool-call chunks.
    if delta.content:
        print(delta.content, end="", flush=True)
Event Shape
Each streamed chunk usually contains:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1713000001,
  "model": "deepseek-ai/DeepSeek-V3.2",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "Hello"
      },
      "finish_reason": null
    }
  ]
}
The final chunk usually sets finish_reason and may include usage details depending on the upstream provider.
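For clients that read the HTTP response directly instead of going through an SDK, the event stream has to be decoded by hand. The following is a minimal sketch of that step, not FlowAPI's documented wire format. It assumes each event arrives as a single `data: <json>` line and that the stream may end with a `data: [DONE]` sentinel, as in OpenAI-compatible APIs; this page does not confirm either detail. `iter_sse_json` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line.

    Assumptions (not confirmed by this page): one JSON object per
    `data:` line, and an optional `data: [DONE]` end-of-stream sentinel.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separators, SSE comments (":" prefix), and other fields.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


# Hand-written sample lines shaped like the chunk shown above.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": null}]}',
    "",
    ": keep-alive",
    'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
chunks = list(iter_sse_json(sample))
print(chunks[-1]["choices"][0]["finish_reason"])
```

The sketch deliberately ignores multi-line `data:` fields, which the SSE specification allows; a production client would typically rely on an SSE library for that.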
Parsing Guidance
- Ignore chunks with empty choices unless you specifically need final usage info.
- Append delta.content in arrival order.
- Watch finish_reason to know when generation is complete.
- Handle network disconnects separately from model-side errors.
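The first three points above can be sketched as a small accumulator over already-decoded chunks. This is an illustrative sketch that works on plain dicts shaped like the example under Event Shape; `collect_stream` is a hypothetical helper name, and the `usage` payload in the demo data is invented for illustration.

```python
from typing import Iterable, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str], Optional[dict]]:
    """Return (text, finish_reason, usage) assembled from streamed chunk dicts."""
    parts = []
    finish_reason = None
    usage = None
    for chunk in chunks:
        # Usage, when provided, may arrive on a chunk with empty choices.
        if chunk.get("usage"):
            usage = chunk["usage"]
        choices = chunk.get("choices") or []
        if not choices:
            continue
        choice = choices[0]
        content = (choice.get("delta") or {}).get("content")
        if content:
            parts.append(content)  # arrival order is preserved
        if choice.get("finish_reason"):
            finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, usage


# Invented demo data: a role marker, two text deltas, then a usage-only chunk.
demo = [
    {"choices": [{"index": 0, "delta": {"role": "assistant"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "Hel"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {"content": "lo"}, "finish_reason": "stop"}]},
    {"choices": [], "usage": {"total_tokens": 12}},
]
text, reason, usage = collect_stream(demo)
print(text, reason, usage)
```

The loop keeps reading after finish_reason is set so that a trailing usage-only chunk is not lost.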
Best Practices
- Use streaming for long responses, reasoning-heavy tasks, or large documents.
- Increase client-side timeout settings even when using SSE, since a long generation can keep the connection open well past default limits.
- Do not assume every chunk contains text; some may contain role markers or tool call data.
Some upstream providers may emit slightly different chunk timing or optional fields. Your client should be tolerant of missing content values and provider-specific metadata.
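One way to stay tolerant of missing fields is to classify each chunk defensively before acting on it. The sketch below is illustrative only: `describe_delta` is a hypothetical helper, the `tool_calls` field name follows the OpenAI-style convention and is an assumption here, and `x_provider_meta` is an invented stand-in for provider-specific metadata.

```python
def describe_delta(chunk: dict) -> str:
    """Classify a chunk as 'text', 'tool_call', 'role', or 'other' without raising."""
    choices = chunk.get("choices") or []
    if not choices:
        return "other"
    delta = choices[0].get("delta") or {}
    if delta.get("content"):
        return "text"
    if delta.get("tool_calls"):  # assumed OpenAI-style field name
        return "tool_call"
    if delta.get("role"):
        return "role"
    return "other"


# Invented examples covering the cases mentioned above.
examples = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hi"}}], "x_provider_meta": {"region": "a"}},
    {"choices": [{"delta": {"tool_calls": [{"index": 0}]}}]},
    {"choices": [{"delta": {"content": None}}]},
    {"choices": []},
    {},
]
kinds = [describe_delta(c) for c in examples]
print(kinds)
```

Unknown top-level keys are simply ignored, so extra provider metadata does not affect the result.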