
Responses

FlowAPI normalizes the primary chat completion response shape so you can reuse one client-side parsing strategy across many models.

Response Envelope

Typical non-streaming responses look like this:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1713000000,
  "model": "deepseek-ai/DeepSeek-V3.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}

Top-Level Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the completion |
| object | string | Usually chat.completion or chat.completion.chunk |
| created | integer | Unix timestamp when the response was created |
| model | string | The model that produced the response |
| choices | array | Completion candidates returned by the model |
| usage | object | Token usage details, usually present on non-streaming responses |
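
As a minimal sketch of the single parsing strategy described above, assuming the response body has already been decoded into a Python dict, the helpers below read the first choice and the token total while tolerating missing fields. The function names first_message_text and total_tokens are hypothetical and are not part of FlowAPI.

```python
from typing import Any, Optional


def first_message_text(response: dict[str, Any]) -> Optional[str]:
    """Return the assistant text of the first choice, or None if absent."""
    choices = response.get("choices") or []
    if not choices:
        return None
    # Non-streaming responses carry the full text under "message".
    message = choices[0].get("message") or {}
    return message.get("content")


def total_tokens(response: dict[str, Any]) -> Optional[int]:
    """Return usage.total_tokens, or None when usage is not present."""
    usage = response.get("usage") or {}
    return usage.get("total_tokens")
```

Applied to the example envelope above, first_message_text returns the greeting string and total_tokens returns 29. Both return None instead of raising when a field is missing, which helps when the same client talks to many models.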

Choices

Each item in choices usually contains:

| Field | Type | Description |
| --- | --- | --- |
| index | integer | Position of the choice in the returned array |
| message | object | Full assistant message for non-streaming responses |
| delta | object | Partial streamed content for streaming responses |
| finish_reason | string or null | Why generation stopped |

Assistant Message Shape

{ "role": "assistant", "content": "Hello! How can I help you today?" }

If the model supports tool calling, the assistant payload may also include tool_calls.
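
The sketch below shows one way a client might handle an assistant message that carries either text or tool calls. This page does not specify the shape of a tool_calls entry, so the code assumes the widely used OpenAI-compatible layout, where each entry has a function object with a name and a JSON-encoded arguments string. Treat that layout, and the helper name summarize_message, as assumptions to verify against your target model.

```python
import json
from typing import Any


def summarize_message(message: dict[str, Any]) -> dict[str, Any]:
    """Split an assistant message into plain text and decoded tool calls."""
    calls = []
    for call in message.get("tool_calls") or []:
        # Assumed layout: {"function": {"name": ..., "arguments": "<json string>"}}
        function = call.get("function") or {}
        raw_args = function.get("arguments") or "{}"
        try:
            args = json.loads(raw_args)
        except (TypeError, json.JSONDecodeError):
            args = None  # leave undecodable arguments for the caller to inspect
        calls.append({"name": function.get("name"), "arguments": args})
    return {"text": message.get("content"), "tool_calls": calls}
```

A message without tool_calls yields an empty list, so callers can branch on whether the list is empty without checking which models support tool calling.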

Usage Object

FlowAPI commonly returns a usage summary like this:

{ "prompt_tokens": 20, "completion_tokens": 9, "total_tokens": 29 }

Some upstream providers may include more detailed fields such as:

  • prompt_tokens_details
  • completion_tokens_details
  • reasoning_tokens
  • cached token breakdowns

Detailed usage fields vary by upstream model and provider. If your integration depends on a specific usage subfield, test it against the exact target model in production-like conditions.
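
Because the detailed fields vary, a defensive reader can treat the three common counters as optional and pass everything else through untouched. The following is a sketch under that assumption. The helper name read_usage and the extras key are illustrative, not part of FlowAPI.

```python
from typing import Any, Optional

COMMON_USAGE_KEYS = ("prompt_tokens", "completion_tokens", "total_tokens")


def read_usage(usage: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Return the common counters plus any provider-specific extras."""
    usage = usage or {}
    summary: dict[str, Any] = {key: usage.get(key) for key in COMMON_USAGE_KEYS}
    # Anything outside the common counters is provider-specific; keep it
    # separate so callers opt in to fields that may not always exist.
    summary["extras"] = {
        key: value for key, value in usage.items() if key not in COMMON_USAGE_KEYS
    }
    return summary
```

With this split, billing or logging code can rely on the common counters, while code that needs a subfield such as completion_tokens_details has to look it up in extras and handle its absence.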

Finish Reasons

Common finish_reason values include:

| Value | Meaning |
| --- | --- |
| stop | Normal completion |
| length | Generation hit a token or length limit |
| tool_calls | The model emitted tool call instructions |
| content_filter | Output was filtered by a safety policy |

Additional provider-specific values may appear depending on the upstream model.
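
Since unlisted values can appear, client code should not assume the set of finish reasons is closed. The sketch below maps the four values listed above to coarse outcomes and falls back to a generic label for anything else. The outcome labels and the function name classify_finish are hypothetical choices made for illustration.

```python
from typing import Optional

# Outcome labels are illustrative; only the keys come from the list above.
FINISH_OUTCOMES = {
    "stop": "complete",
    "length": "truncated",
    "tool_calls": "needs_tool_execution",
    "content_filter": "filtered",
}


def classify_finish(finish_reason: Optional[str]) -> str:
    """Map a finish_reason to a coarse outcome without failing on new values."""
    if finish_reason is None:
        # No reason reported yet, as on intermediate streaming chunks.
        return "in_progress"
    return FINISH_OUTCOMES.get(finish_reason, "unknown")
```

Returning "unknown" rather than raising keeps an integration working when an upstream provider introduces a new value, while still making the case easy to log.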

Streaming Responses

When stream=true, the response is delivered as a server-sent events (SSE) stream, and each event usually carries a partial delta object instead of a full message.

Example chunk:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion.chunk",
  "created": 1713000001,
  "model": "deepseek-ai/DeepSeek-V3.2",
  "choices": [
    {
      "index": 0,
      "delta": { "content": "Hello" },
      "finish_reason": null
    }
  ]
}
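
To illustrate how chunks like this combine into a full message, here is a minimal sketch that reads already-received SSE lines and concatenates the delta content of the first choice. It assumes each event is a single "data:" line holding one JSON chunk, and that the stream may end with a "data: [DONE]" sentinel, as in OpenAI-compatible APIs. This page does not confirm the sentinel, so treat it and the helper names as assumptions.

```python
import json
from typing import Any, Iterable, Optional


def parse_sse_line(line: str) -> Optional[dict[str, Any]]:
    """Decode one SSE line into a chunk dict; return None for other lines."""
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank separators, comments, or other SSE fields
    payload = line[len("data:"):].strip()
    if not payload or payload == "[DONE]":  # assumed end-of-stream sentinel
        return None
    return json.loads(payload)


def collect_stream(lines: Iterable[str]) -> tuple[str, Optional[str]]:
    """Join streamed delta content for choice 0 and capture its finish_reason."""
    parts: list[str] = []
    finish_reason: Optional[str] = None
    for line in lines:
        chunk = parse_sse_line(line)
        if chunk is None:
            continue
        for choice in chunk.get("choices") or []:
            if choice.get("index", 0) != 0:
                continue  # this sketch only follows the first choice
            delta = choice.get("delta") or {}
            content = delta.get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```

In a real client the lines would come from an HTTP library's streaming iterator. The function takes any iterable of strings so it can be exercised with recorded output first.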

See Streaming for a full SSE walkthrough.
