Parameters

Chat Completion Parameters

Parameter	Type	Required	Default	Description
`model`	string	Yes	-	The model ID to use, such as `gpt-4o` or `deepseek-ai/DeepSeek-V3.2`
`messages`	array	Yes	-	The conversation payload in OpenAI-compatible message format
`temperature`	number	No	`1.0`	Sampling temperature from `0.0` to `2.0`
`top_p`	number	No	`1.0`	Nucleus sampling cutoff
`max_tokens`	integer	No	Model default	Maximum number of output tokens to generate
`stream`	boolean	No	`false`	Stream partial responses as server-sent events
`stop`	string or array	No	`null`	Up to 4 sequences that stop generation
`frequency_penalty`	number	No	`0`	Penalize repeated tokens from `-2.0` to `2.0`
`presence_penalty`	number	No	`0`	Penalize tokens that already appeared in the conversation
`response_format`	object	No	`null`	Use `{"type": "json_object"}` for JSON mode
`n`	integer	No	`1`	Number of completions to generate

Message Object

Each message in the messages array uses the following shape:

Field	Type	Required	Description
`role`	string	Yes	One of `system`, `user`, `assistant`, or `tool`
`content`	string or array	Yes	Text content, tool content, or multimodal content parts
`name`	string	No	Optional sender name
`tool_call_id`	string	No	Required when replying to a tool call with `role: "tool"`

Multimodal Content Parts

For models that support multimodal input, content can be an array of typed parts:


[
  { "type": "text", "text": "What is in this image?" },
  {
    "type": "image_url",
    "image_url": {
      "url": "https://example.com/cat.png",
      "detail": "high"
    }
  }
]

Structured Output

For basic JSON-only responses:


{
  "response_format": { "type": "json_object" }
}

More advanced schema-constrained output may work for some models, but availability depends on the upstream model/provider.

Provider-Specific Optional Fields

Some upstream models may also accept additional OpenAI-style parameters such as:

tools
tool_choice
top_k
seed
logit_bias
repetition_penalty

These are model-dependent passthrough fields. Test the target model before depending on them in production.

Temperature vs. `top_p`

In most cases, adjust either temperature or top_p, but not both at the same time.

temperature=0: nearly deterministic output.
temperature=0.5: balanced creativity and reliability.
temperature=1.5+: more creative but potentially less consistent.

Practical Guidance

Keep max_tokens below the model’s full context window so there is room for the input prompt.
Use stream=true for long generations to reduce timeout risk.
If a parameter works for one model but not another, assume provider-level differences unless the docs say otherwise.

Embeddings Parameters

Parameter	Type	Required	Default	Description
`model`	string	Yes	-	Embedding model ID such as `text-embedding-3-small`
`input`	string or array	Yes	-	Input text or list of texts to embed
`user`	string	No	`null`	Optional end-user identifier for your own tracking

Most embedding requests are simpler than chat requests: you usually only need model and input.