Text Generation

1. Model core capabilities

1.1 Basic functions

  • Text generation for a wide range of product and workflow scenarios.
  • Semantic understanding for multi-turn chat and instruction-heavy tasks.
  • Knowledge Q&A across technical, business, and general-purpose domains.
  • Code assistance for generation, explanation, and debugging.

1.2 Advanced capabilities

  • Long-context processing for large prompts and document-heavy tasks.
  • Instruction following for structured outputs and tool-friendly prompts.
  • Style control through system prompts and sampling parameters.
  • Multimodal-ready workflows where supported by the underlying model.

2. API call specifications

FlowAPI is OpenAI-compatible, so you can use the standard request format and SDKs.

2.1 Generate dialogue

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about recursion in programming."}
    ],
    temperature=0.7,
    max_tokens=1024,
    stream=True
)

for chunk in response:
    if not chunk.choices:
        continue
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Generate JSON data

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_FLOW_API_KEY",
    base_url="https://api.flowapi.net/v1"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
        {"role": "user", "content": "List 3 programming languages and their primary use cases as JSON."}
    ],
    response_format={"type": "json_object"}
)

print(response.choices[0].message.content)
```
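The JSON reply arrives as a string in `message.content`, so it usually needs to be parsed before use. The following is a minimal sketch of that step; `parse_json_reply` is a hypothetical helper name, not part of FlowAPI or the OpenAI SDK, and the sample string stands in for a real response.

```python
import json


def parse_json_reply(content: str) -> dict:
    """Parse a JSON-mode reply and confirm the top level is an object."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        # A reply cut off by max_tokens is one possible cause of invalid JSON
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


# Illustrative content only; a real call would pass response.choices[0].message.content
sample = '{"languages": [{"name": "Python", "use_case": "data analysis"}]}'
parsed = parse_json_reply(sample)
print(parsed["languages"][0]["name"])
```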

2.2 Message structure

| Message type | Purpose | Example |
| --- | --- | --- |
| system | Define the assistant’s role and behavior | “You are a pediatrician with 10 years of experience.” |
| user | Provide the end-user request | “How should a persistent fever in a toddler be treated?” |
| assistant | Include previous model responses | “I suggest measuring the temperature first…” |

Message roles are useful when you want the model to follow layered instructions, preserve conversation state, or maintain a consistent response style across turns.
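As a rough illustration of preserving conversation state, the sketch below builds a message history that can be passed as the `messages` argument on each request. The helper names `start_conversation` and `add_turn` are hypothetical, and the final user message is invented for the example.

```python
from typing import Dict, List

Message = Dict[str, str]


def start_conversation(system_prompt: str) -> List[Message]:
    """Create a message history that begins with a system instruction."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(history: List[Message], role: str, content: str) -> List[Message]:
    """Append one user or assistant turn to the history."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role!r}")
    history.append({"role": role, "content": content})
    return history


history = start_conversation("You are a pediatrician with 10 years of experience.")
add_turn(history, "user", "How should a persistent fever in a toddler be treated?")
add_turn(history, "assistant", "I suggest measuring the temperature first…")
add_turn(history, "user", "The reading is 38.5 °C. What should I do next?")

print([m["role"] for m in history])  # → ['system', 'user', 'assistant', 'user']
```

Because the API is stateless between requests, the full history is sent each time; appending the assistant reply after every call is what keeps later turns consistent with earlier ones.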

3. Model selection guide

Visit the Models page to review currently available language models, pricing, context length, and capabilities.

4. Core parameters

4.1 Creativity controls

```python
temperature=0.5  # Balance creativity and reliability (0.0 to 2.0)
top_p=0.9        # Sample only from the smallest set of tokens whose cumulative probability reaches 90%
```

4.2 Output limits

```python
max_tokens=1000            # Maximum generation length per request
stop=["\n##", "<|end|>"]   # Stop sequences
frequency_penalty=0.5      # Suppress repetitive tokens (-2.0 to 2.0)
stream=True                # Stream output for long responses
```

4.3 Common issues

If output quality is inconsistent, tune the sampling parameters for your use case: lower temperature or top_p for more deterministic output, and raise frequency_penalty to reduce repetition.

Do not set max_tokens to the model’s maximum context length. The context window has to hold both the input and the generated response, so leave enough room for each.
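One way to apply this is to cap max_tokens at whatever the context window has left after the input. The sketch below assumes the window is shared between input and output tokens; `safe_max_tokens`, the safety margin, and the numbers used are all hypothetical, so take the real context length from the Models page.

```python
def safe_max_tokens(
    context_length: int,
    input_tokens: int,
    desired_output: int,
    margin: int = 64,
) -> int:
    """Return a max_tokens value that fits in the remaining context window."""
    # Tokens left once the input and a small safety margin are accounted for
    remaining = context_length - input_tokens - margin
    if remaining <= 0:
        raise ValueError("input leaves no room for a response")
    return min(desired_output, remaining)


# Hypothetical figures: an 8192-token window with an 8000-token prompt
print(safe_max_tokens(context_length=8192, input_tokens=8000, desired_output=1000))  # → 128
```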

If output appears truncated:

  • Increase max_tokens.
  • Enable stream=True for long outputs.
  • Increase client-side timeout settings.
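Before changing any settings, it can help to confirm that the token limit was the cause. In the OpenAI-compatible response format, a choice whose `finish_reason` is `"length"` stopped because it reached max_tokens. The sketch below checks that field; `was_truncated` is a hypothetical helper, and the stand-in object only mimics the shape of a real response.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """True when the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"


# Stand-in with the same attribute shape as a chat completion response
cut_off = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
complete = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])

print(was_truncated(cut_off), was_truncated(complete))  # → True False
```

When streaming, the same field is set on the final chunk of each choice, so the check applies to the last chunk rather than to a single response object.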

5. Error code handling

| Error code | Common cause | Recommendation |
| --- | --- | --- |
| 400 | Invalid parameter format | Validate parameter ranges such as temperature |
| 401 | Missing or invalid API key | Verify your API key configuration |
| 429 | Rate limit exceeded | Use exponential backoff and retry |
| 503 / 504 | Provider overload or timeout | Retry or switch to a backup model |
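The retry recommendations for 429 and 503 / 504 can be sketched as a small wrapper. This is one possible approach, not FlowAPI's prescribed method: `call_with_backoff` is a hypothetical helper, and it assumes the raised exception exposes a `status_code` attribute, as the OpenAI Python SDK's `APIStatusError` does.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes the table above marks as worth retrying
RETRYABLE_STATUS = {429, 503, 504}


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable errors with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            is_last_attempt = attempt == max_attempts - 1
            # Errors such as 400 or 401 will not succeed on retry, so re-raise them
            if status not in RETRYABLE_STATUS or is_last_attempt:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus up to 25% jitter
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

A request would be wrapped as `call_with_backoff(lambda: client.chat.completions.create(...))`. Switching to a backup model after repeated 503 / 504 responses could be layered on top by catching the final exception and retrying with a different `model` value.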

6. Billing

Total Cost = (Input Tokens x Input Unit Price) + (Output Tokens x Output Unit Price)
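As a worked example of the formula, the sketch below estimates the cost of one request. The prices are placeholders, not FlowAPI rates, and the sketch assumes rates are quoted per million tokens; `estimate_cost` is a hypothetical helper. In the OpenAI-compatible response format, token counts are reported in `response.usage.prompt_tokens` and `response.usage.completion_tokens`.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Apply Total Cost = input tokens x input price + output tokens x output price."""
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = output_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Placeholder rates: 2.50 per million input tokens, 10.00 per million output tokens
cost = estimate_cost(
    input_tokens=1200,
    output_tokens=350,
    input_price_per_million=2.50,
    output_price_per_million=10.00,
)
print(f"{cost:.4f}")  # → 0.0065
```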

Visit the Pricing page for model-specific rates.

Model capabilities and availability change over time. Check the live Models page for the latest information.
