Overview

FlowAPI is designed to be OpenAI-compatible for the most common inference workflows. At a high level, you send requests to one base URL, authenticate with one API key format, and get one response format across multiple models.

Base URL

Use the following base URL for model inference:


https://api.flowapi.net/v1

Health checks are exposed separately at:


https://api.flowapi.net/healthz

Authentication

Send your API key in the Authorization header:


Authorization: Bearer YOUR_FLOW_API_KEY
Content-Type: application/json

FlowAPI gateway-issued API keys currently use the fk- prefix.

You can also provide an optional X-Request-ID header for tracing. If omitted, FlowAPI generates one and returns it in the response headers.
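As a minimal sketch, the headers above can be assembled like this. The helper name buildFlowHeaders is hypothetical and not part of any FlowAPI SDK; the fk- check only warns, on the assumption that other key formats may exist.

```typescript
// Hypothetical helper: builds the request headers described above.
function buildFlowHeaders(apiKey: string, requestId?: string): Record<string, string> {
  // Gateway-issued keys currently use the fk- prefix. This only warns,
  // because other key formats may be valid (assumption).
  if (!apiKey.startsWith("fk-")) {
    console.warn("API key does not start with fk-; check that it is a gateway-issued key.");
  }
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  // X-Request-ID is optional; when it is omitted, FlowAPI generates one.
  if (requestId !== undefined) {
    headers["X-Request-ID"] = requestId;
  }
  return headers;
}
```

Passing your own request ID makes it easier to correlate client logs with gateway logs; leaving it out is equally valid.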

Primary Endpoints

The two most important endpoints are:

POST /v1/chat/completions: chat-style text generation
POST /v1/embeddings: vector generation

Request Overview

For POST /v1/chat/completions, the request body usually looks like this:


type Request = {
  model: string;
  messages: Message[];
 
  response_format?: { type: "json_object" } | object;
  stop?: string | string[];
  stream?: boolean;
 
  max_tokens?: number;
  temperature?: number;
  top_p?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
  n?: number;
 
  // Provider-specific fields may be forwarded when supported by
  // the selected upstream model, but compatibility is model-dependent.
  [key: string]: unknown;
};
 
type Message =
  | {
      role: "system" | "user" | "assistant";
      content: string | ContentPart[];
      name?: string;
    }
  | {
      role: "tool";
      content: string;
      tool_call_id: string;
      name?: string;
    };
 
type ContentPart =
  | {
      type: "text";
      text: string;
    }
  | {
      type: "image_url";
      image_url: {
        url: string;
        detail?: "low" | "high" | "auto" | string;
      };
    };

The exact set of advanced parameters depends on the selected model provider. FlowAPI keeps the main request shape familiar, but provider-specific extras are not guaranteed to work uniformly across every model.

Example Request


{
  "model": "deepseek-ai/DeepSeek-V3.2",
  "messages": [
    { "role": "system", "content": "You are a concise technical assistant." },
    { "role": "user", "content": "Summarize the benefits of streaming responses." }
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
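One possible way to send the example request with fetch is sketched below. The helper buildChatCompletionCall and the key fk-your-key are placeholders, not part of FlowAPI; the helper only prepares the URL and options so the network call stays in your own code.

```typescript
const FLOW_BASE_URL = "https://api.flowapi.net/v1";

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type ChatRequestBody = {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
  max_tokens?: number;
  stream?: boolean;
};

// Hypothetical helper: returns the URL and fetch options for a chat completion call.
function buildChatCompletionCall(apiKey: string, body: ChatRequestBody) {
  return {
    url: `${FLOW_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Same payload as the example request above; the API key is a placeholder.
const chatCall = buildChatCompletionCall("fk-your-key", {
  model: "deepseek-ai/DeepSeek-V3.2",
  messages: [
    { role: "system", content: "You are a concise technical assistant." },
    { role: "user", content: "Summarize the benefits of streaming responses." },
  ],
  temperature: 0.7,
  max_tokens: 1024,
  stream: false,
});
```

The result can be passed straight to fetch(chatCall.url, chatCall.init) in any runtime that provides fetch.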

For vector generation, FlowAPI also supports POST /v1/embeddings with an OpenAI-compatible request body using model and input.
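A comparable sketch for embeddings follows. The helper buildEmbeddingsCall and the model name your-embedding-model are placeholders; check the model list for an actual embedding model ID. Accepting an array for input is an assumption based on OpenAI compatibility.

```typescript
type EmbeddingsRequestBody = {
  model: string;
  // A single string, or (assumed from OpenAI compatibility) an array of strings.
  input: string | string[];
};

// Hypothetical helper: returns the URL and fetch options for an embeddings call.
function buildEmbeddingsCall(apiKey: string, body: EmbeddingsRequestBody) {
  return {
    url: "https://api.flowapi.net/v1/embeddings",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// "your-embedding-model" is a placeholder, not a real model ID.
const embeddingsCall = buildEmbeddingsCall("fk-your-key", {
  model: "your-embedding-model",
  input: ["first passage to embed", "second passage to embed"],
});
```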

Structured Output

For basic JSON output, use response_format: { "type": "json_object" }.

See Structured Outputs for examples and support caveats.
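Even with json_object mode, it is safer to validate the returned text before using it. The sketch below assumes the model output arrives as a string (in OpenAI-style responses this would be choices[0].message.content); parseJsonObjectContent is a hypothetical helper name.

```typescript
type JsonParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; error: string };

// Hypothetical helper: parses model output and confirms it is a JSON object.
function parseJsonObjectContent(content: string): JsonParseResult {
  try {
    const parsed: unknown = JSON.parse(content);
    // Arrays and primitives are valid JSON but not a JSON object.
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return { ok: true, value: parsed as Record<string, unknown> };
    }
    return { ok: false, error: "content is valid JSON but not an object" };
  } catch (err) {
    // Truncated output (for example, finish_reason "length") often lands here.
    return { ok: false, error: `content is not valid JSON: ${String(err)}` };
  }
}
```

Returning a result object instead of throwing lets the caller decide whether to retry, fall back, or surface the error.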

Streaming

FlowAPI supports Server-Sent Events (SSE) for streaming responses.

Set stream: true in the request body, and parse delta chunks as they arrive.

See Streaming for end-to-end examples.
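A minimal sketch of delta parsing is shown here. It assumes OpenAI-style SSE framing: each event is a data: line carrying a JSON chunk with choices[0].delta.content, and the stream ends with a data: [DONE] sentinel. Both the chunk shape and the sentinel are assumptions to verify against the Streaming page; SseDeltaCollector is a hypothetical class name.

```typescript
// Assumed chunk shape, mirroring OpenAI-style streaming responses.
type StreamChunk = {
  choices?: { delta?: { content?: string }; finish_reason?: string | null }[];
};

class SseDeltaCollector {
  private buffer = "";
  text = "";
  done = false;

  // Feed decoded text as it arrives; network reads can split a line anywhere.
  push(fragment: string): void {
    this.buffer += fragment;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""), so keep it for the next push.
    this.buffer = lines.pop() ?? "";
    for (const rawLine of lines) {
      const line = rawLine.trim();
      // Skip blank separators, comments, and non-data SSE fields.
      if (!line.startsWith("data:")) continue;
      const payload = line.slice("data:".length).trim();
      if (payload === "[DONE]") {
        this.done = true;
        continue;
      }
      const chunk = JSON.parse(payload) as StreamChunk;
      this.text += chunk.choices?.[0]?.delta?.content ?? "";
    }
  }
}
```

In practice you would read the response body with a stream reader, decode each chunk with TextDecoder, and call push with the decoded text.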

Response Overview

Non-streaming responses return a full message; streaming responses return incremental delta chunks.

See Responses for the full schema and field explanations.

Finish Reasons

Common finish_reason values include:

stop: normal completion
length: output stopped because the token limit was reached
tool_calls: the model emitted tool call instructions
content_filter: output was filtered by a safety policy

Some upstream providers may return additional provider-specific values.
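Because providers may return values outside the list above, client code should keep a fallback branch. A minimal sketch, with the hypothetical helper classifyFinishReason and illustrative action names:

```typescript
type FinishAction = "done" | "truncated" | "run_tools" | "filtered" | "unknown";

// Hypothetical helper: maps finish_reason to a client-side action.
function classifyFinishReason(reason: string | null | undefined): FinishAction {
  switch (reason) {
    case "stop":
      return "done"; // normal completion
    case "length":
      return "truncated"; // consider raising max_tokens or continuing
    case "tool_calls":
      return "run_tools"; // execute the requested tools, then send the results
    case "content_filter":
      return "filtered"; // output was filtered by a safety policy
    default:
      return "unknown"; // provider-specific or missing value
  }
}
```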

Provider-Specific Parameters

FlowAPI keeps the primary schema stable, but some models may accept additional OpenAI-style or provider-specific fields such as:

tools
tool_choice
top_k
logit_bias
repetition_penalty
seed

These fields may be forwarded when the upstream model supports them, but they are not guaranteed to behave consistently across all providers.

See Tool Calling for tools and tool_choice details.
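Provider-specific fields are passed as additional top-level keys in the request body. The sketch below merges them so that core fields cannot be overwritten by accident; withProviderExtras is a hypothetical helper, and whether seed or top_k has any effect still depends on the selected model.

```typescript
type CoreChatBody = {
  model: string;
  messages: { role: string; content: string }[];
};

// Hypothetical helper: adds optional extras without letting them override core fields.
function withProviderExtras(
  core: CoreChatBody,
  extras: Record<string, unknown>,
): Record<string, unknown> {
  // Extras are spread first, so core fields win on any name clash.
  return { ...extras, ...core };
}

const bodyWithExtras = withProviderExtras(
  {
    model: "deepseek-ai/DeepSeek-V3.2",
    messages: [{ role: "user", content: "Hello" }],
  },
  { seed: 42, top_k: 40 },
);
```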

Error Shape

Gateway-generated errors follow this structure:


{
  "error": {
    "message": "Missing or invalid Bearer API key.",
    "type": "invalid_api_key",
    "code": "401"
  }
}

See the dedicated Error Codes page for the full matrix.
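A small type guard can separate gateway-generated errors from other payloads. This sketch relies only on the structure shown above; isGatewayError is a hypothetical helper, and upstream provider errors may use a different shape.

```typescript
type GatewayError = {
  error: { message: string; type: string; code: string };
};

// Hypothetical helper: checks that a parsed body matches the gateway error structure.
function isGatewayError(value: unknown): value is GatewayError {
  if (typeof value !== "object" || value === null) return false;
  const inner = (value as { error?: unknown }).error;
  if (typeof inner !== "object" || inner === null) return false;
  const fields = inner as Record<string, unknown>;
  return (
    typeof fields["message"] === "string" &&
    typeof fields["type"] === "string" &&
    typeof fields["code"] === "string"
  );
}
```

Note that code is a string in the example above, so compare it with "401" rather than the number 401.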