Image Generation Overview

QWave API supports multiple image generation models, all called through the unified async task endpoint /v1/tasks. Image generation typically completes in 1-30 seconds. After submission, you receive a task_id and retrieve results via polling or webhook callback.

Endpoint

POST https://www.qingbo.dev/v1/tasks

For the async flow (submit, then status, then result, plus an optional webhook), see Task System.
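As a rough illustration, a submit call could be built with Python's standard library as sketched below. The endpoint URL is the one given above; the Bearer-style Authorization header and the helper names (build_submit_request, submit_task) are assumptions made for this sketch, so check the Authentication page for the actual scheme.

```python
import json
import urllib.request

TASKS_URL = "https://www.qingbo.dev/v1/tasks"  # endpoint from this page


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a task."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: Bearer-token auth; see the Authentication page.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(TASKS_URL, data=body, headers=headers, method="POST")


def submit_task(api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the parsed JSON response (it carries task_id)."""
    request = build_submit_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the first half easy to inspect or unit-test without network access.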

Supported Models

Gemini Image

Google Gemini Image — Nano Banana series (2.5 Flash / 3 Pro / 3.1 Flash) + official direct

GPT Image

OpenAI GPT-Image — 1 / 1.5 / 2, multimodal generation + edit + inpainting

Seedream

ByteDance Doubao Seedream — 4.0 / 4.5 / 5.0-lite, up to 14 reference images, sequential image groups

Imagen

Google Imagen 4.0 — flagship text-to-image, native 2K, strong CN/EN text rendering

Qwen Image

Alibaba Tongyi Qwen Image 2.0 / Pro, 1K/2K, strong text rendering

Wan Image

Alibaba Tongyi Wan 2.7 — 6 actions, image groups / bbox selection / color themes

Z-Image

Zhipu Z.ai Z-Image-Turbo — lightweight and fast, bilingual CN/EN

Common Parameters (shared across models)

Actual support varies per model — see each vendor doc for details.

model
string, required
Model ID (group_name); pick one from the vendor docs above.

action
string, default: "generate"
Operation type. Allowed values depend on the model. Common ones:

generate: text-to-image (default)
image2image: image-to-image (requires image_urls)
edit: image editing (inpainting / lighting changes)
reference: multi-image reference fusion
inpaint: inpainting (requires mask_url, GPT-Image series)
group: sequential image group (a thematically related set)
interactive_edit: interactive editing with bbox selection (Wan Image only)

prompt
string, required
Image description text; supports both Chinese and English.

integer, default: 1
Number of images to generate (some vendors cap the per-call count).

seed
integer, default: -1
Random seed; -1 means random. A fixed value reproduces similar results.

aspect_ratio
string
Aspect ratio, e.g. 16:9 / 9:16 / 1:1. Some vendors support auto (smart selection).

resolution
string
Output resolution, e.g. 1K / 2K / 4K. The supported range varies per vendor.

image_urls
string[]
Reference image URL array (for image-to-image / multi-image reference).

callback_url
string
Webhook callback URL, invoked when the task reaches a terminal state. See Task System.
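The parameters above could be assembled into a request body along these lines. This is only a sketch: build_payload is a hypothetical helper, it covers just the named fields listed in this section, and it drops unset optional fields on the assumption that the backend then applies its documented defaults.

```python
from typing import Optional, Sequence


def build_payload(
    model: str,
    prompt: str,
    action: str = "generate",
    seed: Optional[int] = None,
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
    image_urls: Optional[Sequence[str]] = None,
    callback_url: Optional[str] = None,
) -> dict:
    """Assemble a request body, leaving out optional fields that were not set."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    payload = {"model": model, "action": action, "prompt": prompt}
    optional = {
        "seed": seed,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        # image_urls is always sent as an array, even for one image.
        "image_urls": list(image_urls) if image_urls else None,
        "callback_url": callback_url,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload
```

For example, build_payload("doubao-seedream-4.5", "a lighthouse at dusk", aspect_ratio="16:9", resolution="2K") yields a text-to-image body; whether a given model accepts that aspect ratio and resolution depends on its vendor doc.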

Submit Response Example

{
  "task_id": "task-wave1775285160b950328499",
  "model": "doubao-seedream-4.5",
  "action": "generate",
  "status": "queued",
  "created_at": 1775285160040,
  "progress": 0
}

After receiving the task_id, call GET /v1/tasks/{task_id} to poll until status = completed, then read the image URLs.
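The polling step might look like the sketch below. Only the completed status is confirmed by this page; the "failed" terminal status, the 2-second interval, and the 120-second timeout are assumptions, and the Task System page is the place to check the real state machine and recommended cadence. The fetch function is passed in so the loop can be exercised without network access.

```python
import time
from typing import Callable

# Assumption: failure-type terminal statuses; only "completed" is documented here.
ASSUMED_FAILURE_STATES = {"failed"}


def poll_task(
    fetch: Callable[[str], dict],
    task_id: str,
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(task_id) repeatedly until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)  # expected to wrap GET /v1/tasks/{task_id}
        status = task.get("status")
        if status == "completed":
            return task
        if status in ASSUMED_FAILURE_STATES:
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

If a callback_url was supplied at submission, this loop is optional, since the webhook fires once the task reaches a terminal state.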

Mode Quick Reference

Mode	Trigger Fields	Typical Vendors
Text-to-image (T2I)	`prompt` only	All
Image-to-image / multi-image reference	`+ image_urls` (single or multiple)	Seedream / Gemini / GPT-Image / Wan / Qwen
Image editing (inpainting / lighting changes)	`+ image_urls + action: edit`	Seedream 4.0/4.5 / Gemini / GPT-Image / Wan
Inpainting (with mask)	`+ image_urls + mask_url + action: inpaint`	GPT-Image series
Sequential image group (thematic set)	`sequential_image_generation: auto` or `enable_sequential: true`	Seedream / Wan
Interactive editing (bbox selection)	`+ image_urls + bbox_list`	Wan Image
Search-augmented generation	`google_search: true` or `google_image_search: true`	Gemini 3.1 Flash

Not every model supports every mode — check each vendor doc for the actual action list and field range. The backend validates request fields against the vendor’s declared capabilities and rejects out-of-range requests.
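To make the trigger fields in the table concrete, the sketch below classifies a request body on the client side. infer_mode is a hypothetical helper, and the precedence between rows is an assumption: the page does not say how the backend resolves a request that matches several rows.

```python
def infer_mode(payload: dict) -> str:
    """Guess which mode from the quick reference a request body would trigger."""
    action = payload.get("action", "generate")
    has_images = bool(payload.get("image_urls"))

    if has_images and payload.get("mask_url") and action == "inpaint":
        return "inpainting"
    if has_images and payload.get("bbox_list"):
        return "interactive editing"
    if has_images and action == "edit":
        return "image editing"
    if (
        payload.get("sequential_image_generation") == "auto"
        or payload.get("enable_sequential") is True
    ):
        return "sequential image group"
    if payload.get("google_search") or payload.get("google_image_search"):
        return "search-augmented generation"
    if has_images:
        return "image-to-image"
    return "text-to-image"
```

A check like this can be useful for logging or for catching an obviously mismatched request before submission; the backend's own validation remains authoritative.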

Field Naming Conventions

Media references are always plural: use image_urls even for a single image, passed as a one-element array such as ["one.jpg"].
Aspect ratio is unified as aspect_ratio. Vendor internals that use size / ratio are implementation details you don’t need to track.
Resolution is unified as resolution. Vendor internals that use quality are implementation details.
The mask field is mask_url (used for GPT-Image series inpainting).
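The plural-field convention can be enforced client-side with a small normalizer. The helper name as_image_urls is hypothetical; the sketch only shows one way to guarantee that a single URL is still sent as an array.

```python
from typing import Iterable, List, Union


def as_image_urls(value: Union[str, Iterable[str]]) -> List[str]:
    """Return image references as a list, wrapping a lone URL string."""
    if isinstance(value, str):
        return [value]
    return list(value)
```

Calling as_image_urls("one.jpg") gives a one-element list suitable for the image_urls field, and an existing list or tuple of URLs is returned as a list.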

Related

Task System Reference: task state machine / polling cadence / webhook
Request & Response: common error codes / headers / rate limits
Authentication: API key application and usage
Models: full model lookup endpoint
