The QWave API supports multiple image generation models, all invoked through the unified asynchronous task endpoint /v1/tasks. Image generation typically completes in 1 to 30 seconds. Submitting a request returns a task_id; you then retrieve the result by polling or through a webhook callback.

Endpoint

POST https://www.qingbo.dev/v1/tasks
For the async flow, see Task System — Submit → Status → Result + optional webhook.
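A minimal sketch of submitting a task with Python's standard library. The endpoint URL is the one given on this page; the Bearer-token Authorization header and the helper names build_submit_request and submit_task are assumptions for illustration, so check the authentication docs for the actual scheme.

```python
import json
import urllib.request

API_URL = "https://www.qingbo.dev/v1/tasks"  # endpoint from this page


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits a task."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token auth; not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit_task(payload: dict, api_key: str) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_submit_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Separating request construction from sending keeps the first function free of network access, which makes it easy to inspect or unit-test the exact body and headers before any call is made.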

Supported Models

Gemini Image

Google Gemini Image — Nano Banana series (2.5 Flash / 3 Pro / 3.1 Flash) + official direct

GPT Image

OpenAI GPT-Image — 1 / 1.5 / 2, multimodal generation + edit + inpainting

Seedream

ByteDance Doubao Seedream — 4.0 / 4.5 / 5.0-lite, up to 14 reference images, sequential image groups

Imagen

Google Imagen 4.0 — flagship text-to-image, native 2K, strong CN/EN text rendering

Qwen Image

Alibaba Tongyi Qwen Image 2.0 / Pro, 1K/2K, strong text rendering

Wan Image

Alibaba Tongyi Wan 2.7 — 6 actions, image groups / bbox selection / color themes

Z-Image

Zhipu Z.ai Z-Image-Turbo — lightweight and fast, bilingual CN/EN

Common Parameters (shared across models)

Actual support varies per model — see each vendor doc for details.
model (string, required)
Model ID (group_name); pick one from the vendor docs above.

action (string, default: "generate")
Operation type. Allowed values depend on the model. Common ones:
  • generate: text-to-image (default)
  • image2image: image-to-image (requires image_urls)
  • edit: image editing (inpainting / lighting changes)
  • reference: multi-image reference fusion
  • inpaint: inpainting (requires mask_url, GPT-Image series)
  • group: sequential image group (a thematically related set)
  • interactive_edit: interactive editing with bbox selection (Wan Image only)

prompt (string, required)
Image description text; supports both Chinese and English.

n (integer, default: 1)
Number of images to generate (some vendors cap the per-call count).

seed (integer, default: -1)
Random seed; -1 means random. A fixed value reproduces similar results.

aspect_ratio (string)
Aspect ratio, e.g. 16:9 / 9:16 / 1:1. Some vendors support auto (smart selection).

resolution (string)
Output resolution, e.g. 1K / 2K / 4K. The supported range varies per vendor.

image_urls (string[])
Array of reference image URLs (for image-to-image / multi-image reference).

callback_url (string)
Webhook callback URL, invoked when the task reaches a terminal state. See Task System.
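The common parameters can be assembled into a request body as sketched below. The field names and defaults follow the list above; the build_payload helper and its client-side checks are illustrative assumptions, not part of the API, and the backend remains the authority on what each model accepts.

```python
from typing import Optional, Sequence


def build_payload(
    model: str,
    prompt: str,
    action: str = "generate",
    n: int = 1,
    seed: int = -1,
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
    image_urls: Optional[Sequence[str]] = None,
    callback_url: Optional[str] = None,
) -> dict:
    """Assemble a /v1/tasks request body from the common parameters."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    if n < 1:
        raise ValueError("n must be at least 1")
    if isinstance(image_urls, str):
        # A bare string would otherwise be split into characters by list().
        raise TypeError("image_urls must be a list of URLs, not a string")

    payload = {"model": model, "action": action, "prompt": prompt, "n": n, "seed": seed}
    optional = {
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "image_urls": list(image_urls) if image_urls is not None else None,
        "callback_url": callback_url,
    }
    # Omit unset optional fields so vendor-side defaults apply.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

For example, build_payload("doubao-seedream-4.5", "a red fox", aspect_ratio="16:9") yields a body containing the two required fields, the three defaulted fields, and aspect_ratio, with every other optional field left out.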

Submit Response Example

{
  "task_id": "task-wave1775285160b950328499",
  "model": "doubao-seedream-4.5",
  "action": "generate",
  "status": "queued",
  "created_at": 1775285160040,
  "progress": 0
}
After receiving the task_id, poll GET /v1/tasks/{task_id} until status is completed, then read the image URLs from the response.
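The polling step might look like the following sketch. The status value completed comes from this page; failed is an assumed name for an error state, and the interval and timeout are arbitrary starting points. The fetch_status callable is a hypothetical stand-in for whatever function performs GET /v1/tasks/{task_id} and returns the decoded JSON.

```python
import time
from typing import Callable

# "completed" is documented on this page; "failed" is an assumed name for
# the error state and should be checked against the Task System doc.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_task(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the task is terminal or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout_s}s")
        sleep(interval_s)
```

Passing the fetch and sleep functions in as arguments keeps the loop independent of any HTTP client and lets it be exercised without network access or real delays.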

Mode Quick Reference

| Mode | Trigger Fields | Typical Vendors |
| --- | --- | --- |
| Text-to-image (T2I) | prompt only | All |
| Image-to-image / multi-image reference | + image_urls (single or multiple) | Seedream / Gemini / GPT-Image / Wan / Qwen |
| Image editing (inpainting / lighting changes) | + image_urls + action: edit | Seedream 4.0/4.5 / Gemini / GPT-Image / Wan |
| Inpainting (with mask) | + image_urls + mask_url + action: inpaint | GPT-Image series |
| Sequential image group (thematic set) | sequential_image_generation: auto or enable_sequential: true | Seedream / Wan |
| Interactive editing (bbox selection) | + image_urls + bbox_list | Wan Image |
| Search-augmented generation | google_search: true or google_image_search: true | Gemini 3.1 Flash |
Not every model supports every mode — check each vendor doc for the actual action list and field range. The backend validates request fields against the vendor’s declared capabilities and rejects out-of-range requests.
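Because the backend rejects requests that lack a mode's trigger fields, a client may want to catch the obvious cases before submitting. The sketch below encodes only the field requirements listed in the quick reference; the REQUIRED_FIELDS mapping and the missing_fields helper are illustrative assumptions, and they do not check per-model support.

```python
# Fields each action needs, taken from the quick reference above.
# Per-model support is NOT checked here; the backend does that.
REQUIRED_FIELDS = {
    "generate": ["prompt"],
    "image2image": ["prompt", "image_urls"],
    "reference": ["prompt", "image_urls"],
    "edit": ["prompt", "image_urls"],
    "inpaint": ["prompt", "image_urls", "mask_url"],
    "interactive_edit": ["prompt", "image_urls", "bbox_list"],
}


def missing_fields(payload: dict) -> list:
    """Return the required fields that are absent or empty in the payload."""
    action = payload.get("action", "generate")
    # Unknown actions fall back to the one universally required field.
    required = REQUIRED_FIELDS.get(action, ["prompt"])
    return [name for name in required if not payload.get(name)]
```

An empty return value means the payload carries every trigger field for its action; it does not guarantee that the chosen model accepts that action.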

Field Naming Conventions

  • Media references are always plural: use image_urls even for a single image, passed as a one-element array such as ["one.jpg"]
  • Aspect ratio is unified as aspect_ratio: vendor-internal names such as size / ratio are implementation details you don’t need to track
  • Resolution is unified as resolution: vendor-internal names such as quality are implementation details
  • The mask field is mask_url (used for inpainting with the GPT-Image series)
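The plural convention for media references can be enforced with a small client-side normalizer, sketched below. The as_image_urls helper is hypothetical and not part of the API; it only guarantees that the value sent as image_urls is an array.

```python
from typing import Iterable, List, Optional, Union


def as_image_urls(images: Optional[Union[str, Iterable[str]]]) -> List[str]:
    """Coerce a single URL or any iterable of URLs into a list of strings."""
    if images is None:
        return []
    if isinstance(images, str):
        # A single image still travels as a one-element array.
        return [images]
    return list(images)
```

A caller holding one URL can then write payload["image_urls"] = as_image_urls("one.jpg") and send the array form the API expects.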