Text Generation
1. Model core capabilities
1.1 Basic functions
- Text generation for a wide range of product and workflow scenarios.
- Semantic understanding for multi-turn chat and instruction-heavy tasks.
- Knowledge Q&A across technical, business, and general-purpose domains.
- Code assistance for generation, explanation, and debugging.
1.2 Advanced capabilities
- Long-context processing for large prompts and document-heavy tasks.
- Instruction following for structured outputs and tool-friendly prompts.
- Style control through system prompts and sampling parameters.
- Multimodal-ready workflows where supported by the underlying model.
2. API call specifications
FlowAPI is OpenAI-compatible, so you can use the standard request format and SDKs.
2.1 Generate dialogue
from openai import OpenAI

# Point the standard OpenAI client at the FlowAPI endpoint.
client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about recursion in programming."}
    ],
    temperature=0.7,
    max_tokens=1024,
    stream=True
)

# Print text as it arrives; skip chunks that carry no choices or no content.
for chunk in response:
    if not chunk.choices:
        continue
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Generate JSON data
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "List 3 programming languages and their primary use cases as JSON."}
    ],
    # Ask the model to return a single JSON object.
    response_format={"type": "json_object"}
)

# The JSON arrives as a string in the message content.
print(response.choices[0].message.content)

2.2 Message structure
| Message type | Purpose | Example |
|---|---|---|
| system | Define the assistant’s role and behavior | "You are a pediatrician with 10 years of experience." |
| user | Provide the end-user request | "How should a persistent fever in a toddler be treated?" |
| assistant | Include previous model responses | "I suggest measuring the temperature first…" |
Message roles are useful when you want the model to follow layered instructions, preserve conversation state, or maintain a consistent response style across turns.
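As a minimal sketch of how roles preserve conversation state, the example below keeps a running message list and appends each turn to it. The helpers `new_conversation` and `add_turn` are hypothetical names introduced for illustration; they are not part of FlowAPI or the OpenAI SDK. The follow-up user question is also made up. The resulting list is what you would pass as `messages` on the next request.

```python
from typing import Dict, List

Message = Dict[str, str]
VALID_ROLES = {"system", "user", "assistant"}


def new_conversation(system_prompt: str) -> List[Message]:
    """Start a history with a single system message (hypothetical helper)."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Append one message to the history and return it (hypothetical helper)."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    history.append({"role": role, "content": content})
    return history


history = new_conversation("You are a pediatrician with 10 years of experience.")
add_turn(history, "user", "How should a persistent fever in a toddler be treated?")
# After each request, store the model's reply so the next turn can refer to it.
add_turn(history, "assistant", "I suggest measuring the temperature first…")
add_turn(history, "user", "What should I do after that?")

for message in history:
    print(f"{message['role']}: {message['content']}")
```

Because every request resends the full list, long conversations consume more input tokens on each turn.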
3. Model selection guide
Visit the Models page to review currently available language models, pricing, context length, and capabilities.
4. Core parameters
4.1 Creativity controls
temperature=0.5 # Balance creativity and reliability (0.0 to 2.0)
top_p=0.9 # Nucleus sampling: sample only from the smallest set of tokens whose cumulative probability reaches 0.9
4.2 Output limits
max_tokens=1000 # Maximum generation length per request
stop=["\n##", "<|end|>"] # Stop sequences
frequency_penalty=0.5 # Suppress repetitive tokens (-2.0 to 2.0)
stream=True # Stream output for long responses
4.3 Common issues
If output quality is inconsistent, lower temperature or top_p for more deterministic results, and raise frequency_penalty if the output repeats itself.
Do not set max_tokens to the model’s maximum context length. Input tokens and generated tokens share the same context window, so max_tokens must leave room for the prompt.
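One way to apply this is to compute max_tokens from the space left after the prompt. The sketch below is illustrative only: `safe_max_tokens` is a hypothetical helper, the 8,000-token context window is a made-up figure (check the Models page for real limits), and the input token count is assumed to come from a tokenizer or from the usage data of an earlier response.

```python
def safe_max_tokens(
    context_window: int,
    input_tokens: int,
    desired_output: int,
    reserve: int = 64,
) -> int:
    """Return a max_tokens value that fits in the space left after the input.

    `reserve` is an assumed safety margin for per-message overhead.
    """
    available = context_window - input_tokens - reserve
    if available <= 0:
        raise ValueError("input leaves no room for a response; shorten the prompt")
    return min(desired_output, available)


# Hypothetical numbers: 8,000-token window, 6,500-token prompt.
budget = safe_max_tokens(context_window=8000, input_tokens=6500, desired_output=2000)
print(budget)  # 1436
```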
If output appears truncated:
- Increase max_tokens.
- Enable stream=True for long outputs.
- Increase client-side timeout settings.
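To tell a token-limit cutoff apart from a timeout, you can inspect the finish reason of the response. In the OpenAI chat completions format, `finish_reason` is `"length"` when generation stopped at the token limit. The sketch below assumes FlowAPI passes that field through unchanged; `was_truncated` is a hypothetical helper, and the stand-in objects only mimic the shape of an SDK response so the example runs offline.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """True when the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-ins shaped like a chat completion response (not real API output).
cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])

print(was_truncated(cut_off), was_truncated(complete))  # True False
```

If the check returns True, raising max_tokens is the relevant fix; if the response never arrives at all, look at streaming and timeout settings instead.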
5. Error code handling
| Error code | Common cause | Recommendation |
|---|---|---|
| 400 | Invalid parameter format | Validate parameter ranges such as temperature |
| 401 | Missing or invalid API key | Verify your API key configuration |
| 429 | Rate limit exceeded | Use exponential backoff and retry |
| 503 / 504 | Provider overload or timeout | Retry or switch to a backup model |
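The retry recommendation for 429 and 503 / 504 can be sketched as a small wrapper with exponential backoff. This is one possible approach, not FlowAPI's prescribed method: `retry_with_backoff` and `TransientError` are hypothetical names, and the delays and attempt count are assumed defaults. The `sleep` parameter is injectable so the demo runs without actually waiting.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    retryable: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on `retryable` errors with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% random jitter so clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_attempts must be at least 1")


class TransientError(Exception):
    """Hypothetical stand-in for a 429 / 503 / 504 failure."""


attempts = {"count": 0}


def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("simulated 429")
    return "ok"


delays = []  # record the delays instead of sleeping
result = retry_with_backoff(flaky, (TransientError,), sleep=delays.append)
print(result, attempts["count"], len(delays))  # ok 3 2
```

With the OpenAI Python SDK (v1 and later), `openai.RateLimitError` is the exception raised for a 429 response and could be passed in `retryable`. Errors such as 400 and 401 should not be retried, since the same request will fail again.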
6. Billing
Total Cost = (Input Tokens × Input Unit Price) + (Output Tokens × Output Unit Price)
Visit the Pricing page for model-specific rates.
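As a worked example of the formula, the sketch below estimates the cost of one request. It assumes prices are quoted per 1 million tokens, which is a common convention but not confirmed here, and the prices used are invented for illustration. `estimate_cost` is a hypothetical helper; real token counts would come from the usage data returned with a response.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Apply the billing formula, with unit prices given per 1M tokens (assumed)."""
    input_cost = input_tokens * input_price_per_1m
    output_cost = output_tokens * output_price_per_1m
    return (input_cost + output_cost) / 1_000_000


# Invented prices: 2.50 per 1M input tokens, 10.00 per 1M output tokens.
cost = estimate_cost(
    input_tokens=1200,
    output_tokens=300,
    input_price_per_1m=2.50,
    output_price_per_1m=10.00,
)
print(f"{cost:.4f}")  # 0.0060
```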
Model capabilities and availability change over time. Check the live Models page for the latest information.