Parameters
Chat Completion Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
model | string | Yes | - | The model ID to use, such as gpt-4o or deepseek-ai/DeepSeek-V3.2 |
messages | array | Yes | - | The conversation payload in OpenAI-compatible message format |
temperature | number | No | 1.0 | Sampling temperature from 0.0 to 2.0 |
top_p | number | No | 1.0 | Nucleus sampling cutoff |
max_tokens | integer | No | Model default | Maximum number of output tokens to generate |
stream | boolean | No | false | Stream partial responses as server-sent events |
stop | string or array | No | null | Up to 4 sequences that stop generation |
frequency_penalty | number | No | 0 | Penalize repeated tokens from -2.0 to 2.0 |
presence_penalty | number | No | 0 | Penalize tokens that already appeared in the conversation |
response_format | object | No | null | Use {"type": "json_object"} for JSON mode |
n | integer | No | 1 | Number of completions to generate |
Message Object
Each message in the messages array uses the following shape:
| Field | Type | Required | Description |
|---|---|---|---|
role | string | Yes | One of system, user, assistant, or tool |
content | string or array | Yes | Text content, tool content, or multimodal content parts |
name | string | No | Optional sender name |
tool_call_id | string | No | Required when replying to a tool call with role: "tool" |
Multimodal Content Parts
For models that support multimodal input, content can be an array of typed parts:
[
{ "type": "text", "text": "What is in this image?" },
{
"type": "image_url",
"image_url": {
"url": "https://example.com/cat.png",
"detail": "high"
}
}
]Structured Output
For basic JSON-only responses:
{
"response_format": { "type": "json_object" }
}More advanced schema-constrained output may work for some models, but availability depends on the upstream model/provider.
Provider-Specific Optional Fields
Some upstream models may also accept additional OpenAI-style parameters such as:
toolstool_choicetop_kseedlogit_biasrepetition_penalty
These are model-dependent passthrough fields. Test the target model before depending on them in production.
Temperature vs. top_p
In most cases, adjust either temperature or top_p, but not both at the same time.
temperature=0: nearly deterministic output.temperature=0.5: balanced creativity and reliability.temperature=1.5+: more creative but potentially less consistent.
Practical Guidance
- Keep
max_tokensbelow the model’s full context window so there is room for the input prompt. - Use
stream=truefor long generations to reduce timeout risk. - If a parameter works for one model but not another, assume provider-level differences unless the docs say otherwise.
Embeddings Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
model | string | Yes | - | Embedding model ID such as text-embedding-3-small |
input | string or array | Yes | - | Input text or list of texts to embed |
user | string | No | null | Optional end-user identifier for your own tracking |
Most embedding requests are simpler than chat requests: you usually only need model and input.
Last updated on