POST /chat/completions

Authorizations

Authorization
string
header, required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
enum<string>
default: deepseek-ai/DeepSeek-V2.5, required

The name of the model to query.

Available options:
deepseek-ai/DeepSeek-V2.5,
deepseek-ai/DeepSeek-V2-Chat,
deepseek-ai/DeepSeek-Coder-V2-Instruct,
Tencent/Hunyuan-A52B-Instruct,
Qwen/Qwen2.5-72B-Instruct-128K,
Qwen/Qwen2.5-72B-Instruct,
Qwen/Qwen2-VL-72B-Instruct,
Qwen/Qwen2.5-32B-Instruct,
Qwen/Qwen2.5-14B-Instruct,
Qwen/Qwen2.5-7B-Instruct,
Qwen/Qwen2.5-Math-72B-Instruct,
Qwen/Qwen2.5-Coder-32B-Instruct,
Qwen/Qwen2.5-Coder-7B-Instruct,
Qwen/Qwen2-72B-Instruct,
Qwen/Qwen2-7B-Instruct,
Qwen/Qwen2-1.5B-Instruct,
Qwen/Qwen2-57B-A14B-Instruct,
TeleAI/TeleChat2,
TeleAI/TeleMM,
01-ai/Yi-1.5-34B-Chat-16K,
01-ai/Yi-1.5-9B-Chat-16K,
01-ai/Yi-1.5-6B-Chat,
THUDM/chatglm3-6b,
THUDM/glm-4-9b-chat,
Vendor-A/Qwen/Qwen2-72B-Instruct,
Vendor-A/Qwen/Qwen2.5-72B-Instruct,
internlm/internlm2_5-7b-chat,
internlm/internlm2_5-20b-chat,
OpenGVLab/InternVL2-Llama3-76B,
OpenGVLab/InternVL2-26B,
nvidia/Llama-3.1-Nemotron-70B-Instruct,
meta-llama/Meta-Llama-3.1-405B-Instruct,
meta-llama/Meta-Llama-3.1-70B-Instruct,
meta-llama/Meta-Llama-3.1-8B-Instruct,
meta-llama/Meta-Llama-3-8B-Instruct,
meta-llama/Meta-Llama-3-70B-Instruct,
google/gemma-2-27b-it,
google/gemma-2-9b-it,
Pro/Qwen/Qwen2.5-7B-Instruct,
Pro/Qwen/Qwen2-7B-Instruct,
Pro/Qwen/Qwen2-1.5B-Instruct,
Pro/Qwen/Qwen2-VL-7B-Instruct,
Pro/01-ai/Yi-1.5-9B-Chat-16K,
Pro/01-ai/Yi-1.5-6B-Chat,
Pro/THUDM/chatglm3-6b,
Pro/THUDM/glm-4-9b-chat,
Pro/internlm/internlm2_5-7b-chat,
Pro/OpenGVLab/InternVL2-8B,
Pro/meta-llama/Meta-Llama-3-8B-Instruct,
Pro/meta-llama/Meta-Llama-3.1-8B-Instruct,
Pro/google/gemma-2-9b-it
messages
object[]
required

A list of messages comprising the conversation so far.

messages.role
enum<string>
default: user, required

The role of the message's author. One of: system, user, or assistant.

Available options:
user,
assistant,
system
messages.content
default: SiliconCloud has launched a tiered rate plan and raised the RPM of its free models tenfold. What changes will this bring to the broader large-model application field?, required

The contents of the message.
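Putting the required fields together, a minimal request body can be assembled as below. This is a sketch of the payload only; the transport (e.g. an HTTP POST with the Bearer header described above) is left out, and the helper name is our own:

```python
import json

def build_chat_request(model: str, user_message: str) -> dict:
    """Assemble a minimal /chat/completions request body (sketch)."""
    return {
        "model": model,  # one of the enum values listed above
        "messages": [
            {"role": "user", "content": user_message},
        ],
    }

payload = build_chat_request(
    "deepseek-ai/DeepSeek-V2.5",
    "What changes does a tiered rate plan bring?",
)
body = json.dumps(payload)  # serialize for the POST body (application/json)
```

Optional parameters such as stream, max_tokens, and temperature (documented below) can be merged into the same dict before serializing.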

stream
boolean
default: false

If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE].
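A streaming client has to split the response into data: lines and stop at the [DONE] sentinel. The sketch below assumes the chunks follow the common OpenAI-style delta shape (choices[0].delta.content); this page does not spell out the chunk schema, so verify against an actual response:

```python
import json

def parse_sse_line(line: str):
    """Parse one Server-Sent Events line from a streaming response.

    Returns the decoded JSON chunk, the string "DONE" when the stream
    ends, or None for lines that carry no data.
    """
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return "DONE"
    return json.loads(data)

# Illustrative lines as they might arrive on the wire (contents assumed):
raw_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = ""
for raw in raw_lines:
    parsed = parse_sse_line(raw)
    if parsed == "DONE":
        break
    if parsed is not None:
        text += parsed["choices"][0]["delta"]["content"]
```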

max_tokens
integer
default: 512

The maximum number of tokens to generate.

Required range: 1 < x < 4096
stop

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

temperature
number
default: 0.7

Controls the degree of randomness in the response: higher values produce more varied output, lower values more deterministic output.

top_p
number
default: 0.7

Nucleus sampling parameter: the model samples only from the smallest set of tokens whose cumulative probability reaches top_p, dynamically adjusting the number of candidates for each predicted token.

top_k
number
default: 50

Limits sampling to the k most probable tokens at each step.

frequency_penalty
number
default: 0.5

Penalizes new tokens based on their frequency in the text so far, discouraging verbatim repetition.
n
integer
default: 1

The number of completions to generate and return.

response_format
object

An object specifying the format that the model must output.

response_format.type
string

The type of the response format.

tools
object[]

A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported.

tools.type
enum<string>
required

The type of the tool. Currently, only function is supported.

Available options:
function
tools.function
object
required
tools.function.name
string
required

The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.

tools.function.description
string

A description of what the function does, used by the model to choose when and how to call the function.

tools.function.parameters
object

The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.

Omitting parameters defines a function with an empty parameter list.

tools.function.strict
boolean | null
default: false

Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
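A tool definition combining the fields above might look like this. The get_weather function and its schema are hypothetical, for illustration only; only the outer structure (type, function.name, function.parameters, function.strict) comes from this reference:

```python
def weather_tool() -> dict:
    """A hypothetical function tool following the schema documented above."""
    return {
        "type": "function",  # only "function" is currently supported
        "function": {
            "name": "get_weather",  # a-z, A-Z, 0-9, underscores/dashes, max 64 chars
            "description": "Get the current weather for a city.",
            "parameters": {  # a JSON Schema object
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                },
                "required": ["city"],
            },
            "strict": False,  # default; set True for exact schema adherence
        },
    }

request_body = {
    "model": "deepseek-ai/DeepSeek-V2.5",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [weather_tool()],  # up to 128 entries
}
```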

Response

200 - application/json
id
string
choices
object[]
choices.message
object
choices.message.role
string
choices.message.content
string
choices.finish_reason
enum<string>
Available options:
stop,
eos,
length,
tool_calls
tool_calls
object[]

The tool calls generated by the model, such as function calls.

tool_calls.id
string
required

The ID of the tool call.

tool_calls.type
enum<string>
required

The type of the tool. Currently, only function is supported.

Available options:
function
tool_calls.function
object
required

The function that the model called.

tool_calls.function.name
string
required

The name of the function to call.

tool_calls.function.arguments
string
required

The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
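Since the model may emit invalid JSON or parameters outside your schema, the arguments string should be decoded and checked before dispatching. A minimal validation sketch (the helper name and error format are our own):

```python
import json

def parse_tool_arguments(raw: str, allowed: set):
    """Decode a tool_calls.function.arguments string and validate it.

    Returns (arguments, error): the parsed dict on success, or None plus
    an error message when the JSON is invalid or contains parameters
    not defined in the function schema.
    """
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc}"
    unexpected = set(args) - allowed
    if unexpected:
        return None, f"unexpected parameters: {sorted(unexpected)}"
    return args, None

ok_args, ok_err = parse_tool_arguments('{"city": "Paris"}', {"city"})
bad_args, bad_err = parse_tool_arguments(
    '{"city": "Paris", "planet": "Mars"}', {"city"}
)
```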

usage
object
usage.prompt_tokens
integer
usage.completion_tokens
integer
usage.total_tokens
integer
created
integer
model
string
object
enum<string>
Available options:
chat.completion
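Reading the fields above out of a 200 response is straightforward. The sample body below is illustrative, shaped after the documented fields (id, created, and token counts are invented values):

```python
import json

# An illustrative 200 response shaped like the fields documented above.
sample = json.loads("""
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1730000000,
  "model": "deepseek-ai/DeepSeek-V2.5",
  "choices": [
    {
      "message": {"role": "assistant", "content": "Hello!"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 10, "completion_tokens": 3, "total_tokens": 13}
}
""")

reply = sample["choices"][0]["message"]["content"]
finished_cleanly = sample["choices"][0]["finish_reason"] == "stop"
total = sample["usage"]["total_tokens"]
```

When finish_reason is tool_calls instead of stop, read the tool_calls array documented above rather than message.content.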