QWave API supports multiple image generation models, all called through the unified async task endpoint `/v1/tasks`. Image generation typically completes in 1-30 seconds. After submission, you receive a `task_id` and retrieve results via polling or webhook callback.
Endpoint
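A minimal sketch of how a submission to this endpoint might be constructed with the Python standard library. Only the `/v1/tasks` path comes from this page. The `POST` method, the base URL `https://api.example.com`, the Bearer `Authorization` header, and the helper name `build_submit_request` are assumptions for illustration. The request is built but not sent.

```python
import json
import urllib.request

# Assumed base URL for illustration; substitute the real API host.
BASE_URL = "https://api.example.com"


def build_submit_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a task-submission request for /v1/tasks."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/tasks",
        data=body,
        method="POST",  # assumed: this page does not state the HTTP method
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; see the Authentication page for the real one.
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_submit_request({"prompt": "a red fox in snow"}, api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; keeping construction separate makes the request easy to inspect before any network call.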
Supported Models
Gemini Image
Google Gemini Image — Nano Banana series (2.5 Flash / 3 Pro / 3.1 Flash) + official direct
GPT Image
OpenAI GPT-Image — 1 / 1.5 / 2, multimodal generation + edit + inpainting
Seedream
ByteDance Doubao Seedream — 4.0 / 4.5 / 5.0-lite, up to 14 reference images, sequential image groups
Imagen
Google Imagen 4.0 — flagship text-to-image, native 2K, strong CN/EN text rendering
Qwen Image
Alibaba Tongyi Qwen Image 2.0 / Pro, 1K/2K, strong text rendering
Wan Image
Alibaba Tongyi Wan 2.7 — 6 actions, image groups / bbox selection / color themes
Z-Image
Zhipu Z.ai Z-Image-Turbo — lightweight and fast, bilingual CN/EN
Common Parameters (shared across models)
Actual support varies per model — see each vendor doc for details.
Model ID (group_name); pick one from the vendor docs above
Operation type. Allowed values depend on the model. Common ones:
- `generate`: text-to-image (default)
- `image2image`: image-to-image (requires `image_urls`)
- `edit`: image editing (inpainting / lighting changes)
- `reference`: multi-image reference fusion
- `inpaint`: inpainting (requires `mask_url`, GPT-Image series)
- `group`: sequential image group (a thematically related set)
- `interactive_edit`: interactive editing with bbox selection (Wan Image only)
Image description text; supports both Chinese and English
Number of images to generate (some vendors cap per-call count)
Random seed; `-1` means random. A fixed value reproduces similar results.
Aspect ratio, e.g. `16:9` / `9:16` / `1:1`. Some vendors support `auto` (smart selection).
Output resolution, e.g. `1K` / `2K` / `4K`. The supported range varies per vendor.
Reference image URL array (for image-to-image / multi-image reference)
Webhook callback URL, invoked when the task reaches a terminal state. See Task System.
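The common parameters above could be assembled into a request body roughly as follows. This is a sketch, not the official client: the field names `prompt`, `action`, `image_urls`, `mask_url`, `aspect_ratio`, and `resolution` appear on this page, while the `model` key, the placeholder model ID, and the helper names are assumptions. The image count, seed, and webhook fields are left out because their exact names are not shown here.

```python
from typing import Optional

# Requirements stated on this page: image2image needs image_urls;
# inpaint needs mask_url (and, per the mode table, image_urls).
REQUIRED_BY_ACTION = {
    "image2image": ["image_urls"],
    "inpaint": ["image_urls", "mask_url"],
}


def build_image_payload(
    model: str,
    prompt: str,
    action: str = "generate",
    image_urls: Optional[list] = None,
    mask_url: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
) -> dict:
    """Assemble a request body, omitting optional fields that were not set."""
    # The "model" key is an assumed name for the Model ID field.
    payload = {"model": model, "prompt": prompt, "action": action}
    optional = {
        "image_urls": image_urls,
        "mask_url": mask_url,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})

    # Fail early if the chosen action lacks a field it is documented to need.
    missing = [f for f in REQUIRED_BY_ACTION.get(action, []) if f not in payload]
    if missing:
        raise ValueError(f"action {action!r} requires: {', '.join(missing)}")
    return payload


# "example-model-id" is a placeholder, not a real model ID.
print(build_image_payload("example-model-id", "a lighthouse at dawn", aspect_ratio="16:9"))
```

Checking requirements client-side only saves a round trip; the backend remains the authority on what each model accepts.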
Submit Response Example
After you receive the `task_id`, call `GET /v1/tasks/{task_id}` to poll until `status = completed`, then read the image URLs.
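The polling step might look like the sketch below. The fetch function is injected so the loop can be exercised without a network; in practice it would wrap `GET /v1/tasks/{task_id}`. Only the `completed` status comes from this page. The `failed` and `pending` statuses, the interval and timeout defaults, and the name `poll_task` are assumptions; the Task System reference is the place to check the real state machine and polling cadence.

```python
import time
from typing import Callable

# "completed" comes from this page; "failed" is an assumed terminal state.
TERMINAL_STATES = {"completed", "failed"}


def poll_task(
    fetch_task: Callable[[str], dict],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_task(task_id) repeatedly until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_task(task_id)  # e.g. a wrapper around GET /v1/tasks/{task_id}
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        sleep(interval_s)


# Demonstration with a fake fetcher; "pending" is an assumed intermediate status.
responses = iter([{"status": "pending"}, {"status": "completed"}])
result = poll_task(lambda _task_id: next(responses), "task_123", sleep=lambda _s: None)
print(result["status"])  # completed
```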
Mode Quick Reference
| Mode | Trigger Fields | Typical Vendors |
|---|---|---|
| Text-to-image (T2I) | prompt only | All |
| Image-to-image / multi-image reference | + image_urls (single or multiple) | Seedream / Gemini / GPT-Image / Wan / Qwen |
| Image editing (inpainting / lighting changes) | + image_urls + action: edit | Seedream 4.0/4.5 / Gemini / GPT-Image / Wan |
| Inpainting (with mask) | + image_urls + mask_url + action: inpaint | GPT-Image series |
| Sequential image group (thematic set) | sequential_image_generation: auto or enable_sequential: true | Seedream / Wan |
| Interactive editing (bbox selection) | + image_urls + bbox_list | Wan Image |
| Search-augmented generation | google_search: true or google_image_search: true | Gemini 3.1 Flash |
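As a rough illustration of how the trigger fields in the table map to modes, the sketch below inspects a request body and returns the mode it would select. The field names come from the table; the function name `infer_mode`, the returned labels, and the order of the checks when several triggers are present are assumptions, not documented backend behavior.

```python
def infer_mode(payload: dict) -> str:
    """Return the mode from the table above that a request body would trigger."""
    has_images = bool(payload.get("image_urls"))
    action = payload.get("action", "generate")

    if has_images and payload.get("mask_url") and action == "inpaint":
        return "inpainting"
    if has_images and payload.get("bbox_list"):
        return "interactive editing"
    if has_images and action == "edit":
        return "image editing"
    if (
        payload.get("sequential_image_generation") == "auto"
        or payload.get("enable_sequential") is True
    ):
        return "sequential image group"
    if payload.get("google_search") is True or payload.get("google_image_search") is True:
        return "search-augmented generation"
    if has_images:
        return "image-to-image"
    return "text-to-image"


print(infer_mode({"prompt": "a paper crane"}))  # text-to-image
print(infer_mode({"prompt": "same style", "image_urls": ["one.jpg"]}))  # image-to-image
```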
Not every model supports every mode; check each vendor doc for the actual action list and field range. The backend validates request fields against the vendor's declared capabilities and rejects out-of-range requests.
Field Naming Conventions
- Media references are always plural: always `image_urls`; even for a single image, use a one-element array `["one.jpg"]`
- Aspect ratio is unified as `aspect_ratio`: vendor internals using `size` / `ratio` are implementation details you don't need to track
- Resolution is unified as `resolution`: vendor internals using `quality` are implementation details
- Mask field: `mask_url` (used for GPT-Image series inpainting)
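One way to honor the always-plural convention in client code is to coerce whatever the caller passes into a list before sending. The helper names `normalize_image_urls` and `with_image_urls` are hypothetical; only the `image_urls` field and its one-element-array rule come from this page.

```python
def normalize_image_urls(value) -> list:
    """Coerce a single URL or an iterable of URLs into the list form image_urls expects."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]  # a lone URL becomes a one-element array
    return list(value)


def with_image_urls(payload: dict, images) -> dict:
    """Return a copy of payload with image_urls set as an array (omitted when empty)."""
    urls = normalize_image_urls(images)
    result = {k: v for k, v in payload.items() if k != "image_urls"}
    if urls:
        result["image_urls"] = urls
    return result


print(with_image_urls({"prompt": "same scene at dusk"}, "one.jpg"))
# {'prompt': 'same scene at dusk', 'image_urls': ['one.jpg']}
```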
Related
- Task System Reference — task state machine / polling cadence / webhook
- Request & Response — common error codes / headers / rate limits
- Authentication — API key application and usage
- Models — full model lookup endpoint