Text Generation

1. Model core capabilities

1.1 Basic functions

Text generation for a wide range of product and workflow scenarios.
Semantic understanding for multi-turn chat and instruction-heavy tasks.
Knowledge Q&A across technical, business, and general-purpose domains.
Code assistance for generation, explanation, and debugging.

1.2 Advanced capabilities

Long-context processing for large prompts and document-heavy tasks.
Instruction following for structured outputs and tool-friendly prompts.
Style control through system prompts and sampling parameters.
Multimodal-ready workflows where supported by the underlying model.

2. API call specifications

FlowAPI is OpenAI-compatible, so you can use the standard request format and SDKs.

2.1 Generate dialogue


from openai import OpenAI

# Point the standard OpenAI SDK at the FlowAPI endpoint.
client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about recursion in programming."}
    ],
    temperature=0.7,
    max_tokens=1024,
    stream=True  # Return the reply incrementally as chunks
)

# Print each text fragment as it arrives.
for chunk in response:
    if not chunk.choices:
        continue  # Some chunks may carry no choices; skip them
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Generate JSON data


from openai import OpenAI

client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # In JSON mode the messages themselves should also ask for JSON explicitly.
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "List 3 programming languages and their primary use cases as JSON."}
    ],
    response_format={"type": "json_object"}  # Constrain the reply to a JSON object
)

# The reply content is a JSON-formatted string.
print(response.choices[0].message.content)
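The content returned in JSON mode is still a string, so it usually needs to be parsed and checked before use. The following is a minimal sketch of that step. The helper name `parse_json_reply` and the sample reply are illustrative assumptions, not part of FlowAPI, and real model output will differ.

```python
import json


def parse_json_reply(content: str) -> dict:
    """Parse a JSON-mode reply and require a top-level JSON object."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data


# Illustrative reply only; not actual model output.
sample_reply = '{"languages": [{"name": "Python", "use_case": "data analysis"}]}'
parsed = parse_json_reply(sample_reply)
print(parsed["languages"][0]["name"])
```

In a real call, `response.choices[0].message.content` would be passed in place of `sample_reply`.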

2.2 Message structure

Message type	Purpose	Example
`system`	Define the assistant’s role and behavior	“You are a pediatrician with 10 years of experience.”
`user`	Provide the end-user request	“How should a persistent fever in a toddler be treated?”
`assistant`	Include previous model responses	“I suggest measuring the temperature first…”

Message roles are useful when you want the model to follow layered instructions, preserve conversation state, or maintain a consistent response style across turns.
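To preserve conversation state, earlier turns are resent with each request in the order they occurred. A minimal sketch of assembling such a message list follows. The helper `build_messages` and the sample history are hypothetical and only illustrate the three roles described above.

```python
from typing import Dict, List, Tuple


def build_messages(
    system_prompt: str,
    history: List[Tuple[str, str]],
    user_input: str,
) -> List[Dict[str, str]]:
    """Build a messages list: system prompt, prior turns, then the new request."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_input})
    return messages


# One earlier exchange followed by a follow-up question.
messages = build_messages(
    "You are a pediatrician with 10 years of experience.",
    [("How should a persistent fever in a toddler be treated?",
      "I suggest measuring the temperature first.")],
    "The reading is 38.5 degrees Celsius. What next?",
)
print([m["role"] for m in messages])
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.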

3. Model selection guide

Visit the Models page to review currently available language models, pricing, context length, and capabilities.

4. Core parameters

4.1 Creativity controls


temperature=0.5   # Range 0.0 to 2.0; lower is more deterministic, higher is more varied
top_p=0.9         # Nucleus sampling: sample only from the smallest token set whose cumulative probability reaches 0.9
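To make the effect of top_p concrete, here is a toy illustration of nucleus filtering over a made-up next-token distribution. It is a sketch of the general technique, not the provider's actual sampler. The function name `nucleus_filter` and the probabilities are assumptions chosen for the example.

```python
from typing import Dict


def nucleus_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Made-up distribution for illustration.
next_token_probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.05}
filtered = nucleus_filter(next_token_probs, top_p=0.9)
print(sorted(filtered))  # keeps "a", "an", "the"; "zebra" is dropped
```

Lowering top_p shrinks the candidate set further, which is why small values make output more predictable.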

4.2 Output limits


max_tokens=1000              # Maximum number of tokens generated per request
stop=["\n##", "<|end|>"]     # Generation stops when any of these sequences is produced
frequency_penalty=0.5        # Range -2.0 to 2.0; positive values discourage repeated tokens
stream=True                  # Stream output incrementally for long responses

4.3 Common issues

If output quality is inconsistent, lower temperature or top_p for more deterministic results (adjusting one of the two at a time is usually enough), and raise frequency_penalty if the model repeats itself.

Do not set max_tokens to the model’s maximum context length. The prompt and the generated response share the same context window, so max_tokens must leave room for the input tokens.
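The arithmetic behind this budget can be sketched as follows. The helper `max_output_budget`, the safety margin, and the example numbers are assumptions for illustration. Actual context lengths are listed on the Models page, and prompt token counts depend on the model's tokenizer.

```python
def max_output_budget(context_window: int, prompt_tokens: int, safety_margin: int = 64) -> int:
    """Return the largest max_tokens value that still fits in the context window."""
    remaining = context_window - prompt_tokens - safety_margin
    return max(remaining, 0)


# Hypothetical figures: an 8,000-token window and a 3,000-token prompt.
budget = max_output_budget(context_window=8000, prompt_tokens=3000, safety_margin=100)
print(budget)  # prints 4900
```

A result of 0 means the prompt alone already fills the window and should be shortened.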

If output appears truncated:

Increase max_tokens.
Enable stream=True for long outputs.
Increase client-side timeout settings.
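In the OpenAI-compatible response format, a finish_reason of "length" indicates that generation stopped at the token limit. The sketch below checks for that case, assuming FlowAPI passes this field through unchanged. The helper `was_truncated` is hypothetical, and the response here is a stand-in object so the example runs without a network call.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """Return True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in shaped like a chat completion response, for illustration only.
fake_response = SimpleNamespace(
    choices=[SimpleNamespace(finish_reason="length",
                             message=SimpleNamespace(content="..."))]
)

if was_truncated(fake_response):
    print("Reply hit max_tokens; retry with a larger limit or a shorter prompt.")
```

With a real non-streaming call, the object returned by `client.chat.completions.create` would be passed in directly.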

5. Error code handling

Error code	Common cause	Recommendation
400	Invalid parameter format	Validate parameter ranges such as `temperature`
401	Missing or invalid API key	Verify your API key configuration
429	Rate limit exceeded	Use exponential backoff and retry
503 / 504	Provider overload or timeout	Retry or switch to a backup model
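The retry and fallback recommendations in the table can be combined into one wrapper. This is a minimal sketch under stated assumptions: the helper `call_with_backoff` is hypothetical, it treats any exception exposing a `status_code` attribute of 429, 503, or 504 as retryable, and the delay values are arbitrary defaults rather than FlowAPI guidance.

```python
import random
import time

RETRYABLE_STATUS = {429, 503, 504}


def call_with_backoff(call, models, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Try each model in order, retrying retryable errors with exponential backoff."""
    if not models:
        raise ValueError("At least one model name is required")
    last_error = None
    for model in models:
        for attempt in range(max_attempts):
            try:
                return call(model)
            except Exception as exc:
                status = getattr(exc, "status_code", None)
                if status not in RETRYABLE_STATUS:
                    raise  # 400, 401, and unknown errors will not succeed on retry
                last_error = exc
                # Delay doubles on each attempt (1x, 2x, 4x, ...) plus a little jitter.
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    # Every model exhausted its attempts; surface the most recent error.
    raise last_error
```

Here `call` is any function that takes a model name and performs the request, for example a small wrapper around `client.chat.completions.create`. Listing a backup model second in `models` gives the fallback behavior recommended for 503 and 504 responses.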

6. Billing


Total Cost = (Input Tokens x Input Unit Price) + (Output Tokens x Output Unit Price)
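As a worked example of the formula, the sketch below estimates the cost of a single request. It assumes unit prices are quoted per one million tokens, which is a common convention but should be checked against the Pricing page. The helper `estimate_cost` and the prices used are made up for illustration.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Apply: (input tokens x input price) + (output tokens x output price)."""
    input_cost = input_tokens * input_price_per_million
    output_cost = output_tokens * output_price_per_million
    return (input_cost + output_cost) / 1_000_000


# Hypothetical prices: 2.50 per million input tokens, 10.00 per million output tokens.
cost = estimate_cost(1200, 350, 2.50, 10.00)
print(f"{cost:.4f}")  # prints 0.0065
```

In OpenAI-compatible responses, the token counts are typically available as `response.usage.prompt_tokens` and `response.usage.completion_tokens`, assuming the provider returns usage data.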

Visit the Pricing page for model-specific rates.

Model capabilities and availability change over time. Check the live Models page for the latest information.