Embeddings

FlowAPI also supports an OpenAI-compatible embeddings endpoint for text vector generation.

Endpoint

POST https://api.flowapi.net/v1/embeddings

Headers

Authorization: Bearer YOUR_FLOWAPI_API_KEY
Content-Type: application/json

You can also provide X-Request-ID for tracing.
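For clients that do not use the OpenAI SDK, the same headers can be set on a raw HTTP request. The following is a minimal sketch using only Python's standard library. The helper name `build_embeddings_request` and the request ID value `trace-1234` are illustrative assumptions and are not part of FlowAPI.

```python
import json
import urllib.request

FLOWAPI_EMBEDDINGS_URL = "https://api.flowapi.net/v1/embeddings"


def build_embeddings_request(api_key, model, text, request_id=None):
    """Build (but do not send) a POST request for the embeddings endpoint.

    Hypothetical helper for illustration; it is not part of any FlowAPI SDK.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # X-Request-ID is optional and only used for tracing.
    if request_id is not None:
        headers["X-Request-ID"] = request_id

    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        FLOWAPI_EMBEDDINGS_URL, data=body, headers=headers, method="POST"
    )


request = build_embeddings_request(
    api_key="YOUR_FLOWAPI_API_KEY",
    model="BAAI/bge-m3",
    text="Generate embeddings for semantic search and retrieval with FlowAPI.",
    request_id="trace-1234",  # illustrative value
)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as resp:
#     payload = json.load(resp)
```

Building the request separately from sending it makes it easy to log or inspect the headers before any network call happens.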

Request Body

The following Python example sends an embeddings request through the OpenAI SDK.

from openai import OpenAI

client = OpenAI(
  api_key="YOUR_FLOWAPI_API_KEY",
  base_url="https://api.flowapi.net/v1",
)

response = client.embeddings.create(
  model="BAAI/bge-m3",
  input="Generate embeddings for semantic search and retrieval with FlowAPI.",
)

print(response.data[0].embedding[:5])

Parameters

model

string required

Embedding model name to call.

Example:

"BAAI/bge-m3"

input

string | number[] | string[] | number[][] required

Input text to embed. Accepts either a string or an array of tokens.

To send multiple items in one request, pass an array of strings or an array of token arrays.

The input must not be empty and must stay within the selected model’s maximum token limit.
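When a batch is sent, each item in the response carries an `index` field, as in the response example on this page. The sketch below assumes that `index` matches the position of the corresponding input, as is usual for OpenAI-compatible endpoints, and that the response has been parsed into a plain dict. `pair_inputs_with_embeddings` is a hypothetical helper name.

```python
def pair_inputs_with_embeddings(texts, response):
    """Return a list of (text, embedding) tuples in input order."""
    # Sort defensively by index rather than trusting the order of "data".
    items = sorted(response["data"], key=lambda item: item["index"])
    if len(items) != len(texts):
        raise ValueError(f"expected {len(texts)} embeddings, got {len(items)}")
    return [(text, item["embedding"]) for text, item in zip(texts, items)]


texts = ["first document", "second document"]

# Shaped like the response example on this page; values are illustrative
# and the items are deliberately out of order.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.0021, -0.0199, 0.0555]},
        {"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, 0.0789]},
    ],
    "model": "BAAI/bge-m3",
}

pairs = pair_inputs_with_embeddings(texts, sample_response)
```

With the SDK, the items in `response.data` expose the same fields as attributes (`item.index`, `item.embedding`) instead of dict keys.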

encoding_format

enum<string> default: float

The format to return the embeddings in. Can be either float or base64.

Available options: float, base64
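With `base64`, the `embedding` field is a string rather than a list of floats. The sketch below assumes the payload is a packed array of little-endian 32-bit floats, which is the common convention for OpenAI-compatible servers. This page does not state the byte layout, so compare against a `float` response for your model before relying on it. `decode_base64_embedding` is a hypothetical helper name.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string into a list of floats (assumes float32, little-endian)."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip check using values that float32 represents exactly.
original = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *original)).decode("ascii")
decoded = decode_base64_embedding(encoded)
```

The base64 form is mainly useful for reducing response size on large batches; for quick experiments the default `float` format is simpler.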

Example:

"float"

dimensions

integer

The number of dimensions the output embeddings should have. Only supported by the Qwen/Qwen3 embedding models, which accept the following values:

  • Qwen/Qwen3-Embedding-8B: [64,128,256,512,768,1024,1536,2048,2560,4096]
  • Qwen/Qwen3-Embedding-4B: [64,128,256,512,768,1024,1536,2048,2560]
  • Qwen/Qwen3-Embedding-0.6B: [64,128,256,512,768,1024]

Example:

1024
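Because the allowed values differ per model, it can help to validate `dimensions` on the client before sending a request. The following is a minimal sketch that copies the lists above into a lookup table. `SUPPORTED_DIMENSIONS` and `build_embedding_params` are hypothetical names, and the server's actual behavior for unsupported values is not documented here.

```python
# Allowed values copied from the list above.
SUPPORTED_DIMENSIONS = {
    "Qwen/Qwen3-Embedding-8B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560, 4096],
    "Qwen/Qwen3-Embedding-4B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560],
    "Qwen/Qwen3-Embedding-0.6B": [64, 128, 256, 512, 768, 1024],
}


def build_embedding_params(model, text, dimensions=None):
    """Return request parameters, rejecting unsupported dimensions early."""
    params = {"model": model, "input": text}
    if dimensions is None:
        return params
    allowed = SUPPORTED_DIMENSIONS.get(model)
    if allowed is None:
        raise ValueError(f"{model} does not support the dimensions parameter")
    if dimensions not in allowed:
        raise ValueError(f"{dimensions} is not allowed for {model}; choose one of {allowed}")
    params["dimensions"] = dimensions
    return params


params = build_embedding_params("Qwen/Qwen3-Embedding-0.6B", "hello", dimensions=1024)
# The result can be passed to the SDK as client.embeddings.create(**params).
```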

Response Example

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0021, -0.0199, 0.0555]
    }
  ],
  "model": "BAAI/bge-m3",
  "usage": {
    "prompt_tokens": 18,
    "total_tokens": 18
  }
}

Typical Use Cases

  • semantic search
  • retrieval-augmented generation
  • clustering
  • similarity scoring
  • recommendation systems
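Several of these use cases, such as semantic search and similarity scoring, reduce to comparing embedding vectors. Cosine similarity is a common choice for that comparison, though this page does not prescribe a metric. The sketch below uses plain Python; `cosine_similarity` and `rank_by_similarity` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, vec) for vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Tiny hand-made vectors for illustration; real embeddings are much longer.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
ranking = rank_by_similarity(query, docs)
```

For large collections, a vector index or a numerical library would replace the linear scan shown here.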

Guidance

  • Use shorter, normalized text chunks for better retrieval behavior.
  • Keep your query and document embedding strategy consistent.
  • Choose an embedding-specific model instead of a chat model when possible.
  • Batch related texts in a single request when you want to embed multiple items efficiently.
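The chunking and batching advice above can be sketched as a small preprocessing step. Here "normalized" is interpreted as collapsing whitespace, and the batch size of 32 is an arbitrary default; both are assumptions, since this page does not define either. `prepare_texts` and `iter_batches` are hypothetical helper names.

```python
def prepare_texts(texts):
    """Collapse runs of whitespace and drop texts that end up empty."""
    cleaned = (" ".join(text.split()) for text in texts)
    return [text for text in cleaned if text]


def iter_batches(texts, batch_size=32):
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


raw_chunks = ["  FlowAPI   embeddings ", "", "semantic\nsearch", "clustering"]
batches = list(iter_batches(prepare_texts(raw_chunks), batch_size=2))

# Each batch would then be sent as one request, for example:
# for batch in batches:
#     response = client.embeddings.create(model="BAAI/bge-m3", input=batch)
```

Dropping empty strings before batching also avoids sending input that the endpoint would reject as empty.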

The embeddings endpoint is available at the API layer, but whether a given model can be used depends on the current model catalog. Always verify that the specific model you want to use is exposed and active there.
