Embeddings

FlowAPI also exposes an OpenAI-compatible embeddings endpoint that converts text into vectors.

Endpoint


POST https://api.flowapi.net/v1/embeddings

Headers


Authorization: Bearer YOUR_FLOW_API_KEY
Content-Type: application/json

Optionally, include an X-Request-ID header to trace a request.
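As a rough illustration of the headers above, the sketch below builds (but does not send) a raw HTTP request with Python's standard library. The helper name `build_embeddings_request` and the use of a random UUID as the request ID are assumptions made for this example; any ID scheme your tracing setup understands should work.

```python
import json
import urllib.request
import uuid


def build_embeddings_request(api_key, model, text, request_id=None):
    """Build (but do not send) a POST request for the embeddings endpoint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Optional tracing header; a random UUID is only one possible ID scheme.
        "X-Request-ID": request_id or str(uuid.uuid4()),
    }
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        "https://api.flowapi.net/v1/embeddings",
        data=body,
        headers=headers,
        method="POST",
    )


req = build_embeddings_request(
    "YOUR_FLOW_API_KEY", "BAAI/bge-m3", "hello", request_id="trace-123"
)
# With a real key in place, send it with: urllib.request.urlopen(req)
```

With the OpenAI Python SDK, the same header can be attached per call through the `extra_headers` argument of `client.embeddings.create`.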

Request Body

The following Python example sends an embeddings request using the OpenAI SDK pointed at the FlowAPI base URL.

from openai import OpenAI

client = OpenAI(
  api_key="YOUR_FLOW_API_KEY",
  base_url="https://api.flowapi.net/v1",
)

response = client.embeddings.create(
  model="BAAI/bge-m3",
  input="Generate embeddings for semantic search and retrieval with FlowAPI.",
)

print(response.data[0].embedding[:5])

Parameters

`model`

string required

Embedding model name to call.

Example:


"BAAI/bge-m3"

`input`

string | number[] | string[] | number[][] required

Input text to embed. Accepts either a string or an array of tokens.

To send multiple items in one request, pass an array of strings or an array of token arrays.

The input must not be empty and must stay within the selected model’s maximum token limit.
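The four accepted input shapes can be checked on the client before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named `classify_input`; it only catches empty or malformed values and does not enforce the model's token limit, which would require the model's tokenizer.

```python
def _is_token(value):
    # Token IDs are integers; bool is excluded because it subclasses int.
    return isinstance(value, int) and not isinstance(value, bool)


def classify_input(value):
    """Return the shape of `value`, or raise ValueError if it is unusable."""
    if isinstance(value, str):
        if not value:
            raise ValueError("input string must not be empty")
        return "string"
    if isinstance(value, list) and value:
        if all(isinstance(v, str) and v for v in value):
            return "string[]"
        if all(_is_token(v) for v in value):
            return "number[]"
        if all(isinstance(v, list) and v and all(_is_token(t) for t in v)
               for v in value):
            return "number[][]"
    raise ValueError(
        "input must be a non-empty string, token array, or array of either"
    )


print(classify_input("a single sentence"))          # string
print(classify_input(["first text", "second text"]))  # string[]
print(classify_input([101, 2023, 102]))             # number[]
print(classify_input([[101, 102], [101, 7592]]))    # number[][]
```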

`encoding_format`

enum<string> default: float

The format to return the embeddings in. Can be either float or base64.

Available options: float, base64

Example:


"float"

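When `encoding_format` is set to base64, the `embedding` field arrives as a string rather than a list of floats. The sketch below shows one way to decode it, assuming the payload packs little-endian 32-bit floats as the OpenAI API does; this page does not confirm the byte layout for FlowAPI, so treat it as an assumption to verify. The helper name `decode_base64_embedding` is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into floats (assumes little-endian float32)."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip check with values that float32 represents exactly.
sample = [0.5, -0.25, 1.0]
encoded = base64.b64encode(struct.pack("<3f", *sample)).decode("ascii")
decoded = decode_base64_embedding(encoded)
print(decoded)
```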
`dimensions`

integer

The number of dimensions the output embeddings should have. Supported only by the Qwen/Qwen3 embedding series, with the following allowed values per model:

Qwen/Qwen3-Embedding-8B: [64,128,256,512,768,1024,1536,2048,2560,4096]
Qwen/Qwen3-Embedding-4B: [64,128,256,512,768,1024,1536,2048,2560]
Qwen/Qwen3-Embedding-0.6B: [64,128,256,512,768,1024]
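The allowed values above can be checked before a request is sent. The sketch below copies the three lists into a lookup table and builds a request payload; the names `SUPPORTED_DIMENSIONS` and `build_payload` are hypothetical and exist only for this example.

```python
# Allowed `dimensions` values per model, copied from the lists above.
SUPPORTED_DIMENSIONS = {
    "Qwen/Qwen3-Embedding-8B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560, 4096],
    "Qwen/Qwen3-Embedding-4B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560],
    "Qwen/Qwen3-Embedding-0.6B": [64, 128, 256, 512, 768, 1024],
}


def build_payload(model, text, dimensions=None):
    """Return a request body, validating `dimensions` against the table."""
    payload = {"model": model, "input": text}
    if dimensions is not None:
        allowed = SUPPORTED_DIMENSIONS.get(model)
        if allowed is None:
            raise ValueError(f"{model} does not support the dimensions parameter")
        if dimensions not in allowed:
            raise ValueError(f"{dimensions} is not allowed for {model}: {allowed}")
        payload["dimensions"] = dimensions
    return payload


payload = build_payload("Qwen/Qwen3-Embedding-0.6B", "hello world", dimensions=512)
print(payload)
```

Omitting `dimensions` leaves the field out of the payload entirely, so models outside the Qwen/Qwen3 series are unaffected.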

Response Example


{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0021, -0.0199, 0.0555]
    }
  ],
  "model": "BAAI/bge-m3",
  "usage": {
    "prompt_tokens": 18,
    "total_tokens": 18
  }
}
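The response above contains two entries, one per input, and each entry's `index` identifies the input it belongs to. The sketch below pulls the vectors out of a parsed response in input order; `extract_embeddings` is a hypothetical helper, and the dictionary simply mirrors the example response.

```python
# Parsed JSON body, mirroring the response example above.
response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, 0.0789]},
        {"object": "embedding", "index": 1, "embedding": [0.0021, -0.0199, 0.0555]},
    ],
    "model": "BAAI/bge-m3",
    "usage": {"prompt_tokens": 18, "total_tokens": 18},
}


def extract_embeddings(response):
    """Return the embedding vectors ordered by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


vectors = extract_embeddings(response)
print(len(vectors), "vectors of length", len(vectors[0]))
print("tokens used:", response["usage"]["total_tokens"])
```

Sorting by `index` is a defensive step: it keeps vectors aligned with the inputs even if the entries were to arrive in a different order.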

Typical Use Cases

semantic search
retrieval-augmented generation
clustering
similarity scoring
recommendation systems
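Several of these use cases come down to comparing vectors. As an illustration, the sketch below scores documents against a query with cosine similarity, a common choice for embeddings, though this page does not prescribe a metric. The helper names and the tiny two-dimensional vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return document positions, most similar to the query first."""
    scores = [cosine_similarity(query, doc) for doc in documents]
    return sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)


query_vector = [1.0, 0.0]
document_vectors = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_by_similarity(query_vector, document_vectors)
print(ranking)
```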

Guidance

Use shorter, normalized text chunks for better retrieval behavior.
Keep your query and document embedding strategy consistent.
Choose an embedding-specific model instead of a chat model when possible.
Batch related texts in a single request when you want to embed multiple items efficiently.
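The normalization and batching advice above can be sketched as follows. The whitespace-collapsing rule and the batch size of 2 are illustrative assumptions; this page does not state a maximum number of items per request, so pick a size that suits the model and your payloads.

```python
import re


def normalize_text(text):
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


chunks = ["  First   chunk.\n", "Second\tchunk.", "Third chunk.  "]
cleaned = [normalize_text(chunk) for chunk in chunks]
batches = list(batched(cleaned, batch_size=2))
print(batches)

# Each batch can then be sent as one request, for example:
# response = client.embeddings.create(model="BAAI/bge-m3", input=batch)
```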

The embeddings endpoint is available at the API layer, but which models you can call depends on the current catalog. Always verify that the specific model you want to use is exposed and active there before relying on it.