Embeddings
FlowAPI also supports an OpenAI-compatible embeddings endpoint for text vector generation.
Endpoint
POST https://api.flowapi.net/v1/embeddings
Headers
Authorization: Bearer YOUR_FLOW_API_KEY
Content-Type: application/json
You can also provide X-Request-ID for tracing.
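As a rough illustration of how the endpoint and headers fit together, the sketch below builds the request with Python's standard library without sending it. The helper name build_embeddings_request and the request ID value "trace-001" are hypothetical, and the body uses only the model and input fields.

```python
import json
import urllib.request

API_URL = "https://api.flowapi.net/v1/embeddings"


def build_embeddings_request(api_key, model, text, request_id=None):
    """Build (but do not send) a POST request for the embeddings endpoint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # X-Request-ID is optional and only used for tracing.
    if request_id is not None:
        headers["X-Request-ID"] = request_id
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


req = build_embeddings_request(
    "YOUR_FLOW_API_KEY", "BAAI/bge-m3", "hello", request_id="trace-001"
)
# Passing `req` to urllib.request.urlopen would send it.
```

Building the request separately from sending it makes the headers easy to inspect or log before any network call is made.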
Request Body
The following Python example sends an embeddings request through the OpenAI SDK, pointed at the FlowAPI base URL.
from openai import OpenAI

# Point the OpenAI SDK at the FlowAPI base URL.
client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1",
)

response = client.embeddings.create(
    model="BAAI/bge-m3",
    input="Generate embeddings for semantic search and retrieval with FlowAPI.",
)

# Print the first five values of the first embedding vector.
print(response.data[0].embedding[:5])
Parameters
model
string required
Embedding model name to call.
Example:
"BAAI/bge-m3"
input
string | number[] | string[] | number[][] required
default: Generate embeddings for semantic search and retrieval with FlowAPI.
Input text to embed. Accepts either a string or an array of tokens.
To send multiple items in one request, pass an array of strings or an array of token arrays.
The input must not be empty and must stay within the selected model’s maximum token limit.
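The four accepted input shapes can be checked on the client before a request is sent. The sketch below is one way to do that. The function names are hypothetical, it assumes tokens are integer IDs, and it does not check the model's token limit, which depends on the model's tokenizer.

```python
def _is_token_list(value):
    """True for a non-empty list of integer token IDs (bools excluded)."""
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(t, int) and not isinstance(t, bool) for t in value)
    )


def classify_input(value):
    """Return the shape name of `value`, or raise ValueError if it is not accepted."""
    if isinstance(value, str):
        if value == "":
            raise ValueError("input string must not be empty")
        return "string"
    if not isinstance(value, list) or len(value) == 0:
        raise ValueError("input must be a non-empty string or list")
    if all(isinstance(v, str) and v != "" for v in value):
        return "string[]"
    if _is_token_list(value):
        return "number[]"
    if all(_is_token_list(v) for v in value):
        return "number[][]"
    raise ValueError("input mixes types or contains an empty item")


shape = classify_input(["first document", "second document"])
```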
encoding_format
enum<string> default: float
The format to return the embeddings in. Can be either float or base64.
Available options: float, base64
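When base64 is requested, the embedding arrives as an encoded string rather than a list of numbers. The sketch below decodes such a string under the assumption that it holds packed little-endian 32-bit floats, which is the layout used by the OpenAI API. This page does not state the byte layout FlowAPI uses, so verify it against a float response before relying on it. The function name is hypothetical.

```python
import base64
import struct


def decode_base64_embedding(encoded):
    """Decode a base64 string of little-endian float32 values into a list of floats."""
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("decoded length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip check with locally generated data (not a real API response).
sample = base64.b64encode(struct.pack("<3f", 0.5, -0.25, 1.0)).decode("ascii")
vector = decode_base64_embedding(sample)
```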
Example:
"float"
dimensions
integer
The number of dimensions the output embeddings should have. Supported only by Qwen/Qwen3 series models, with the following allowed values:
Qwen/Qwen3-Embedding-8B: [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560, 4096]
Qwen/Qwen3-Embedding-4B: [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560]
Qwen/Qwen3-Embedding-0.6B: [64, 128, 256, 512, 768, 1024]
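Because the allowed values differ per model, a client-side check can catch an unsupported combination before the request is sent. The sketch below copies the values listed above into a lookup table. The helper name build_payload is hypothetical, and how the API responds to an unsupported value is not described on this page.

```python
SUPPORTED_DIMENSIONS = {
    "Qwen/Qwen3-Embedding-8B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560, 4096],
    "Qwen/Qwen3-Embedding-4B": [64, 128, 256, 512, 768, 1024, 1536, 2048, 2560],
    "Qwen/Qwen3-Embedding-0.6B": [64, 128, 256, 512, 768, 1024],
}


def build_payload(model, text, dimensions=None):
    """Build a request body, validating `dimensions` against the listed values."""
    payload = {"model": model, "input": text}
    if dimensions is not None:
        allowed = SUPPORTED_DIMENSIONS.get(model)
        if allowed is None:
            raise ValueError(f"{model} is not listed as supporting dimensions")
        if dimensions not in allowed:
            raise ValueError(f"{dimensions} is not an allowed size for {model}")
        payload["dimensions"] = dimensions
    return payload


payload = build_payload("Qwen/Qwen3-Embedding-0.6B", "hello", dimensions=1024)
```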
Example:
1024
Response Example
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0021, -0.0199, 0.0555]
    }
  ],
  "model": "BAAI/bge-m3",
  "usage": {
    "prompt_tokens": 18,
    "total_tokens": 18
  }
}
Typical Use Cases
- semantic search
- retrieval-augmented generation
- clustering
- similarity scoring
- recommendation systems
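Several of these use cases reduce to comparing embedding vectors. A minimal sketch of similarity scoring with cosine similarity follows. The function names and the toy two-dimensional vectors are illustrative only, since real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_documents([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```

In a semantic search setup, query_vec would be the embedding of the user's query and doc_vecs the stored embeddings of the documents.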
Guidance
- Use shorter, normalized text chunks for better retrieval behavior.
- Keep your query and document embedding strategy consistent.
- Choose an embedding-specific model instead of a chat model when possible.
- Batch related texts in a single request when you want to embed multiple items efficiently.
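The batching advice can be sketched as two small helpers: one that splits a list of texts into batches, and one that restores input order from a response shaped like the Response Example. The helper names and the batch size of 2 are hypothetical. This page does not state a maximum batch size, and the OpenAI SDK returns objects with attributes rather than plain dictionaries.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embeddings_in_input_order(response_data):
    """Return embedding vectors sorted by their `index` field."""
    ordered = sorted(response_data, key=lambda item: item["index"])
    return [item["embedding"] for item in ordered]


batches = list(chunked(["doc a", "doc b", "doc c"], batch_size=2))

# A response "data" array in the same shape as the Response Example, out of order.
data = [
    {"object": "embedding", "index": 1, "embedding": [0.0021, -0.0199, 0.0555]},
    {"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, 0.0789]},
]
vectors = embeddings_in_input_order(data)
```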
Embeddings support exists at the API layer, but model availability is still the source of truth. Always verify that the specific model you want to use is exposed and active in the current catalog.