POST /chat/completions

Authorizations

Authorization
string
header
required

Use the following format for authentication: Bearer <your api key>
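
As a minimal sketch, the header can be sent like this with the Python requests library; the host below is a placeholder for the service's actual API host, not a value taken from this page.

    import requests

    API_KEY = "<your api key>"            # replace with your key
    BASE_URL = "https://<api-host>"       # placeholder host

    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={
            "Authorization": f"Bearer {API_KEY}",   # required Bearer format
            "Content-Type": "application/json",
        },
        json={
            "model": "Qwen/QVQ-72B-Preview",
            "messages": [{"role": "user", "content": "Hello"}],
        },
    )
    print(resp.status_code)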

Body

application/json
messages
object[]
required

A list of messages comprising the conversation so far.
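
A sketch of such a list is shown below; the multimodal content layout (text plus image_url parts) follows the common OpenAI-compatible convention and is an assumption, since this page does not spell out the message schema.

    # Conversation so far; the image_url content part is an assumed
    # OpenAI-compatible multimodal layout, not confirmed by this page.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}},
            ],
        },
    ]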

model
enum<string>
default:
Qwen/QVQ-72B-Preview
required

The name of the model to use. To improve service quality, we may make changes to the models offered by this service from time to time, including but not limited to bringing models online or offline and adjusting model capabilities. Where feasible, we will give notice through announcements, message pushes, or other appropriate channels.

Available options:
Qwen/QVQ-72B-Preview,
deepseek-ai/deepseek-vl2,
Qwen/Qwen2-VL-72B-Instruct,
OpenGVLab/InternVL2-26B,
Pro/Qwen/Qwen2-VL-7B-Instruct,
Pro/OpenGVLab/InternVL2-8B,
TeleAI/TeleMM
frequency_penalty
number
default:
0.5
max_tokens
integer
default:
512

The maximum number of tokens to generate.

Required range: 1 < x < 4096
n
integer
default:
1

Number of generations to return.

response_format
object

An object specifying the format that the model must output.
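
If the service follows the OpenAI-style response_format object, requesting JSON output might look like the sketch below; the {"type": "json_object"} value is an assumption and should be checked against the service's own examples.

    # Assumed OpenAI-style response_format value; verify before use.
    body = {
        "model": "Qwen/QVQ-72B-Preview",
        "messages": [{"role": "user", "content": "Return a JSON object with keys name and age."}],
        "response_format": {"type": "json_object"},
    }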

stop

Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

stream
boolean
default:
false

If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE]
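
A sketch of consuming the stream, assuming standard Server-Sent Events lines prefixed with "data: "; the host and key are placeholders.

    import json
    import requests

    with requests.post(
        "https://<api-host>/chat/completions",            # placeholder host
        headers={"Authorization": "Bearer <your api key>"},
        json={
            "model": "Qwen/QVQ-72B-Preview",
            "messages": [{"role": "user", "content": "Describe a sunset."}],
            "stream": True,
        },
        stream=True,
    ) as resp:
        for line in resp.iter_lines():
            if not line:
                continue
            payload = line.decode("utf-8")
            if not payload.startswith("data: "):
                continue
            data = payload[len("data: "):]
            if data == "[DONE]":                          # stream terminator
                break
            chunk = json.loads(data)
            # Assumed OpenAI-style delta layout inside each streamed choice.
            delta = chunk["choices"][0].get("delta", {})
            print(delta.get("content", ""), end="", flush=True)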

temperature
number
default:
0.7

Determines the degree of randomness in the response.

top_k
number
default:
50
top_p
number
default:
0.7

The top_p (nucleus) parameter is used to dynamically adjust the number of choices for each predicted token based on the cumulative probabilities.
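
Putting the body parameters together, a non-streaming request body might look like the sketch below; the values shown are the documented defaults, and the stop sequence is an illustrative value only.

    # Request body using the documented defaults for the sampling parameters.
    body = {
        "model": "Qwen/QVQ-72B-Preview",
        "messages": [{"role": "user", "content": "Summarize nucleus sampling in one sentence."}],
        "max_tokens": 512,
        "n": 1,
        "temperature": 0.7,        # degree of randomness
        "top_p": 0.7,              # nucleus sampling cutoff
        "top_k": 50,
        "frequency_penalty": 0.5,
        "stop": ["\n\n"],          # up to 4 stop sequences; example value
        "stream": False,
    }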

Response

200 - application/json
choices
object[]
created
integer
id
string
model
string
object
enum<string>
Available options:
chat.completion
tool_calls
object[]

The tool calls generated by the model, such as function calls.

usage
object
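
Reading the 200 response: the sketch below assumes the usual OpenAI-compatible message/content layout inside each choice, since this page lists the choices and usage objects but does not spell out their inner fields; host and key are placeholders.

    import requests

    resp = requests.post(
        "https://<api-host>/chat/completions",             # placeholder host
        headers={"Authorization": "Bearer <your api key>"},
        json={
            "model": "Qwen/QVQ-72B-Preview",
            "messages": [{"role": "user", "content": "Hi"}],
        },
    )
    data = resp.json()

    # Top-level fields documented above; object is "chat.completion".
    print(data["id"], data["created"], data["model"], data["object"])

    # Assumed layout of each choice; message/content is not spelled out on this page.
    print(data["choices"][0]["message"]["content"])

    # Token accounting for the request.
    print(data["usage"])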