Authorizations
Use the following format for authentication: Bearer <your api key>
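For example, a request authenticated with this header might look like the following Python sketch (the endpoint URL is a placeholder, not taken from this page):

```python
import requests

API_KEY = "your-api-key"  # replace with your actual key
URL = "https://api.example.com/v1/messages"  # placeholder endpoint; check your provider's docs

# The Authorization header carries the API key in Bearer format.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}

body = {
    "model": "Pro/moonshotai/Kimi-K2-Instruct",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}],
}

resp = requests.post(URL, headers=headers, json=body)
print(resp.json())
```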
Body
model
Corresponding model name. To maintain service quality, we make periodic changes to the models provided by this service, including but not limited to bringing models online or offline and adjusting model service capabilities. Where feasible, we will notify you of such changes through appropriate means such as announcements or message pushes.
Available options: deepseek-ai/DeepSeek-V3.1, Pro/moonshotai/Kimi-K2-Instruct, moonshotai/Kimi-K2-Instruct, Pro/deepseek-ai/DeepSeek-V3, deepseek-ai/DeepSeek-V3, moonshotai/Kimi-Dev-72B, baidu/ERNIE-4.5-300B-A47B

Example: "Pro/moonshotai/Kimi-K2-Instruct"
messages
A list of messages comprising the conversation so far.
Required range: 1 - 10 elements
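For illustration, a multi-turn conversation is passed as an ordered list of role/content pairs, along these lines:

```python
# Each entry pairs a conversational role with its content.
messages = [
    {"role": "user", "content": "What's the Greek name for Sun?"},
    {"role": "assistant", "content": "The Greek name for Sun is Helios."},
    {"role": "user", "content": "And for Moon?"},
]
```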
max_tokens
The maximum number of tokens to generate before stopping.
Note that our models may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.
Different models have different maximum values for this parameter. See models for details.
Example: 8192
system
System prompt.
A system prompt is a way of providing context and instructions to the LLM, such as specifying a particular goal or role.
stop_sequences
Custom text sequences that will cause the model to stop generating.

Our models will normally stop when they have naturally completed their turn, which will result in a response stop_reason of "end_turn".

If you want the model to stop generating when it encounters custom strings of text, you can use the stop_sequences parameter. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.
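A sketch of a request using stop_sequences, reusing the placeholder endpoint and headers from the authentication example above:

```python
body = {
    "model": "Pro/moonshotai/Kimi-K2-Instruct",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Count from 1 to 10, one number per line."}],
    # Generation halts as soon as the model emits "7".
    "stop_sequences": ["7"],
}

# If a custom sequence is matched, the response will report:
#   stop_reason   == "stop_sequence"
#   stop_sequence == "7"
```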
stream
If set, tokens are returned as Server-Sent Events as they are made available. The stream terminates with data: [DONE].
Example: true
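A minimal sketch of consuming the event stream, assuming the placeholder endpoint from the authentication example and the requests library:

```python
import json
import requests

API_KEY = "your-api-key"
URL = "https://api.example.com/v1/messages"  # placeholder endpoint, as above

body = {
    "model": "Pro/moonshotai/Kimi-K2-Instruct",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": True,
}

with requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=body,
    stream=True,
) as resp:
    for raw in resp.iter_lines():
        if not raw:
            continue
        line = raw.decode("utf-8")
        if line.startswith("data: "):          # SSE payloads are prefixed with "data: "
            payload = line[len("data: "):]
            if payload == "[DONE]":            # end-of-stream sentinel
                break
            print(json.loads(payload))
```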
temperature
Determines the degree of randomness in the response.
Required range: 0 <= x <= 2
Example: 0.7
top_p
The top_p (nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
Required range: 0.1 <= x <= 1
Example: 0.7
top_k
Required range: 0 <= x <= 50
Example: 50
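The sampling parameters can be set together on a single request; the values below are simply the example values listed above, and top_k is the assumed name for the last parameter:

```python
body = {
    "model": "Pro/moonshotai/Kimi-K2-Instruct",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Write a haiku about the sea."}],
    "temperature": 0.7,  # 0 = near-deterministic, 2 = maximum randomness
    "top_p": 0.7,        # nucleus sampling: keep tokens within 70% cumulative probability
    "top_k": 50,         # assumed parameter name; consider only the 50 most likely tokens
}
```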
tools
Each tool definition includes:

- name: Name of the tool.
- description: Optional, but strongly-recommended description of the tool.
- input_schema: JSON schema for the tool input shape that the model will produce in tool_use output content blocks.
tool_choice
How the model should use the provided tools. The model can use a specific tool, any available tool, decide by itself, or not use tools at all. By default, the model will automatically decide whether to use tools.
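As an illustration, a tool definition following this shape might look like the sketch below; the get_weather tool and its schema are invented, and tool_choice is omitted since its exact value format is not shown on this page:

```python
body = {
    "model": "Pro/moonshotai/Kimi-K2-Instruct",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "name": "get_weather",  # hypothetical tool name
            "description": "Get the current weather for a city.",
            "input_schema": {       # JSON schema for the tool's input shape
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        }
    ],
}

# If the model decides to call the tool, the response will have
# stop_reason == "tool_use" and a tool_use content block containing
# generated input that matches the schema above.
```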
Response
200
type
Object type.
For Messages, this is always "message".
Available options: message
role
Conversational role of the generated message.
This will always be "assistant".
Available options: assistant
content
Content generated by the model.
This is an array of content blocks, each of which has a type that determines its shape.

Example: [{"type": "text", "text": "Hi"}]
If the request input messages ended with an assistant turn, then the response content will continue directly from that last turn. You can use this to constrain the model's output.

For example, if the input messages were:

[
  {"role": "user", "content": "What's the Greek name for Sun? (A) Sol (B) Helios (C) Sun"},
  {"role": "assistant", "content": "The best answer is ("}
]

Then the response content might be:

[{"type": "text", "text": "B)"}]
model
The model that handled the request.
stop_reason
The reason that we stopped.

This may be one of the following values:

- "end_turn": the model reached a natural stopping point or one of your provided custom stop_sequences was generated
- "max_tokens": we exceeded the requested max_tokens or the model's maximum
- "tool_use": the model invoked one or more tools
- "refusal": when streaming classifiers intervene to handle potential policy violations

In non-streaming mode this value is always non-null. In streaming mode, it is null in the message_start event and non-null otherwise.

Available options: end_turn, max_tokens, tool_use, refusal
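In client code, stop_reason can drive control flow; a minimal sketch (the "stop_sequence" branch comes from the stop_sequences description above, though that value is not listed among the options here):

```python
def handle_response(resp: dict) -> None:
    # Dispatch on why generation stopped.
    reason = resp.get("stop_reason")
    if reason == "end_turn":
        print("Model finished its turn naturally.")
    elif reason == "max_tokens":
        print("Hit the max_tokens limit; consider raising it.")
    elif reason == "tool_use":
        print("Model requested a tool call; run it and send the result back.")
    elif reason == "refusal":
        print("Streaming classifier intervened for a potential policy violation.")
    elif reason == "stop_sequence":
        print(f"Matched custom stop sequence: {resp.get('stop_sequence')!r}")
```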
stop_sequence
Which custom stop sequence was generated, if any.
This value will be a non-null string if one of your custom stop sequences was generated.
usage
Billing and rate-limit usage.