POST /chat/completions

Authorizations

Authorization
string (header, required)

Use the following format for authentication: Bearer <your api key>
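For illustration, the header can be assembled as follows; YOUR_API_KEY is a placeholder, not a real key:

```python
# Build the Authorization header for a chat/completions request.
# YOUR_API_KEY is a placeholder -- substitute your actual API key.
api_key = "YOUR_API_KEY"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```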

Body

application/json
messages
object[] (required)

A list of messages comprising the conversation so far.

model
enum<string> (required, default: deepseek-ai/DeepSeek-V3)

The name of the model to use. To improve service quality, we may make changes to the models offered by this service from time to time, including but not limited to taking models online or offline and adjusting model capabilities; where feasible, we will provide notice through announcements, push notifications, or other appropriate channels.

Available options:
deepseek-ai/DeepSeek-R1,
deepseek-ai/DeepSeek-V3,
deepseek-ai/DeepSeek-R1-Distill-Llama-70B,
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B,
deepseek-ai/DeepSeek-R1-Distill-Qwen-14B,
deepseek-ai/DeepSeek-R1-Distill-Llama-8B,
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B,
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B,
Pro/deepseek-ai/DeepSeek-R1-Distill-Llama-8B,
Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B,
Pro/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B,
meta-llama/Llama-3.3-70B-Instruct,
AIDC-AI/Marco-o1,
deepseek-ai/DeepSeek-V2.5,
Qwen/Qwen2.5-72B-Instruct-128K,
Qwen/Qwen2.5-72B-Instruct,
Qwen/Qwen2.5-32B-Instruct,
Qwen/Qwen2.5-14B-Instruct,
Qwen/Qwen2.5-7B-Instruct,
Qwen/Qwen2.5-Coder-32B-Instruct,
Qwen/Qwen2.5-Coder-7B-Instruct,
Qwen/Qwen2-7B-Instruct,
Qwen/Qwen2-1.5B-Instruct,
Qwen/QwQ-32B-Preview,
TeleAI/TeleChat2,
01-ai/Yi-1.5-34B-Chat-16K,
01-ai/Yi-1.5-9B-Chat-16K,
01-ai/Yi-1.5-6B-Chat,
THUDM/glm-4-9b-chat,
Vendor-A/Qwen/Qwen2.5-72B-Instruct,
internlm/internlm2_5-7b-chat,
internlm/internlm2_5-20b-chat,
nvidia/Llama-3.1-Nemotron-70B-Instruct,
meta-llama/Meta-Llama-3.1-405B-Instruct,
meta-llama/Meta-Llama-3.1-70B-Instruct,
meta-llama/Meta-Llama-3.1-8B-Instruct,
google/gemma-2-27b-it,
google/gemma-2-9b-it,
Pro/Qwen/Qwen2.5-7B-Instruct,
Pro/Qwen/Qwen2-7B-Instruct,
Pro/Qwen/Qwen2-1.5B-Instruct,
Pro/THUDM/chatglm3-6b,
Pro/THUDM/glm-4-9b-chat,
Pro/meta-llama/Meta-Llama-3.1-8B-Instruct,
Pro/google/gemma-2-9b-it
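As a sketch, here is a minimal request body combining the two required fields above. The API's base URL is not given in this section, so only the JSON payload is shown, not the HTTP call itself:

```python
import json

# Minimal request body: only the required fields, model and messages.
body = {
    "model": "deepseek-ai/DeepSeek-V3",  # the default model
    "messages": [
        {"role": "user", "content": "Hello!"},
    ],
}
payload = json.dumps(body)
```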
frequency_penalty
number (default: 0.5)

max_tokens
integer (default: 512)

The maximum number of tokens to generate.

Required range: 1 < x < 8192
n
integer (default: 1)

The number of generations to return.

response_format
object

An object specifying the format that the model must output.

stop

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

stream
boolean (default: false)

If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with a data: [DONE] message.
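A sketch of consuming the resulting Server-Sent Events stream. The chunk shape used here (choices[0].delta.content) follows the common OpenAI-compatible convention, and the sample lines below are illustrative, not actual server output:

```python
import json

def parse_sse_lines(lines):
    """Yield content fragments from 'data: ...' SSE lines, stopping at [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and other SSE fields
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # the stream terminates with data: [DONE]
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Illustrative sample stream (not real server output).
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(parse_sse_lines(sample))  # "Hello"
```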

temperature
number (default: 0.7)

Determines the degree of randomness in the response.

tools
object[]

A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported.
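A sketch of one entry in the tools array. The type: function wrapper and JSON Schema parameters shape follow the common OpenAI-compatible convention, and the function name and parameters here are hypothetical:

```python
# Hypothetical function definition for the tools array.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool name
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    },
]
assert len(tools) <= 128  # at most 128 functions are supported
```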

top_k
number (default: 50)

top_p
number (default: 0.7)

The top_p (nucleus sampling) parameter dynamically adjusts the number of candidate tokens considered at each step based on their cumulative probability.

Response

200 - application/json
choices
object[]

created
integer

id
string

model
string

object
enum<string>
Available options: chat.completion

tool_calls
object[]

The tool calls generated by the model, such as function calls.

usage
object
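A sketch of reading the fields above from a parsed response body. The sample JSON is illustrative (all values are made up), and the message shape inside choices follows the OpenAI-compatible convention:

```python
import json

# Illustrative response body -- values are made up for demonstration.
raw = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "deepseek-ai/DeepSeek-V3",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hi there!"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 5, "completion_tokens": 3, "total_tokens": 8}
}
"""
resp = json.loads(raw)
answer = resp["choices"][0]["message"]["content"]   # "Hi there!"
total = resp["usage"]["total_tokens"]               # 8
```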