Skip to Content
API ReferenceParameters

Parameters

Chat Completion Parameters

ParameterTypeRequiredDefaultDescription
modelstringYes-The model ID to use, such as gpt-4o or deepseek-ai/DeepSeek-V3.2
messagesarrayYes-The conversation payload in OpenAI-compatible message format
temperaturenumberNo1.0Sampling temperature from 0.0 to 2.0
top_pnumberNo1.0Nucleus sampling cutoff
max_tokensintegerNoModel defaultMaximum number of output tokens to generate
streambooleanNofalseStream partial responses as server-sent events
stopstring or arrayNonullUp to 4 sequences that stop generation
frequency_penaltynumberNo0Penalize repeated tokens from -2.0 to 2.0
presence_penaltynumberNo0Penalize tokens that already appeared in the conversation
response_formatobjectNonullUse {"type": "json_object"} for JSON mode
nintegerNo1Number of completions to generate

Message Object

Each message in the messages array uses the following shape:

FieldTypeRequiredDescription
rolestringYesOne of system, user, assistant, or tool
contentstring or arrayYesText content, tool content, or multimodal content parts
namestringNoOptional sender name
tool_call_idstringNoRequired when replying to a tool call with role: "tool"

Multimodal Content Parts

For models that support multimodal input, content can be an array of typed parts:

[ { "type": "text", "text": "What is in this image?" }, { "type": "image_url", "image_url": { "url": "https://example.com/cat.png", "detail": "high" } } ]

Structured Output

For basic JSON-only responses:

{ "response_format": { "type": "json_object" } }

More advanced schema-constrained output may work for some models, but availability depends on the upstream model/provider.

Provider-Specific Optional Fields

Some upstream models may also accept additional OpenAI-style parameters such as:

  • tools
  • tool_choice
  • top_k
  • seed
  • logit_bias
  • repetition_penalty

These are model-dependent passthrough fields. Test the target model before depending on them in production.

Temperature vs. top_p

In most cases, adjust either temperature or top_p, but not both at the same time.

  • temperature=0: nearly deterministic output.
  • temperature=0.5: balanced creativity and reliability.
  • temperature=1.5+: more creative but potentially less consistent.

Practical Guidance

  • Keep max_tokens below the model’s full context window so there is room for the input prompt.
  • Use stream=true for long generations to reduce timeout risk.
  • If a parameter works for one model but not another, assume provider-level differences unless the docs say otherwise.

Embeddings Parameters

ParameterTypeRequiredDefaultDescription
modelstringYes-Embedding model ID such as text-embedding-3-small
inputstring or arrayYes-Input text or list of texts to embed
userstringNonullOptional end-user identifier for your own tracking

Most embedding requests are simpler than chat requests: you usually only need model and input.

Last updated on