Text Generation
Claude Messages API
POST
Documentation Index
Fetch the complete documentation index at: https://docs.qingbo.dev/llms.txt
Use this file to discover all available pages before exploring further.
- Fully compatible with the Claude Messages API format
- Supports multi-turn conversations and one-shot queries
- Supports multimodal content including text and images
Authorizations
API key used for authentication.Visit the API Key management page to obtain your API Key.Add it to the request header:
API version.Specifies which Claude API version to use.Example:
2023-06-01Body
Model name.
claude-opus-4.6— Claude 4.6 Opus, latest flagshipclaude-sonnet-4.6— Claude 4.6 Sonnet, latest versionclaude-opus-4.5— Claude 4.5 Opus flagshipclaude-sonnet-4.5— Claude 4.5 Sonnet, balancedclaude-haiku-4.5— Claude 4.5 Haiku, fast response
Message list with alternating
user and assistant roles.Maximum tokens to generate.Maximum number of tokens before generation stops. The model may stop earlier.Maximum value varies by model — refer to the model documentation.Minimum: 1
System prompt.The system prompt defines Claude’s role, personality, goals, and instructions.String format:Structured format:
Temperature, range 0–1.Controls output randomness:
- Low values (e.g., 0.2): more deterministic, more conservative
- High values (e.g., 0.8): more random, more creative
Nucleus sampling parameter, range 0–1.Uses nucleus sampling. We recommend using either
temperature or top_p, not both.Default: 1.0Top-K sampling.Sample only from the top K highest-probability options to remove “long-tail” low-probability responses.Recommended only for advanced use cases.
Whether to enable streaming output.
true: Stream the response progressively via Server-Sent Events (SSE).false: Return the full response in one go.
Stop sequences.Custom text sequences that stop generation when encountered. Up to 4 sequences, each up to 32 tokens long.
Metadata.An object used to track or identify the request.
Response
Unique identifier of the message.
Object type, always
message.Role, always
assistant.Array of message content.
Name of the model that actually served the request.
Reason generation stopped.Possible values:
end_turn— Natural completionmax_tokens— Reached max token limitstop_sequence— Stop sequence encounteredtool_use— Tool use
The triggering stop sequence (if any).
Token usage statistics.