Create Chat Completion Request
Creates a model response for the given chat conversation.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
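As a minimal sketch, the documented Bearer scheme can be set up like this in Python (the key value is a placeholder):

```python
# Hedged sketch of the Bearer authentication header described above.
API_KEY = "YOUR_API_KEY"  # your auth token

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```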
Body
model
The name of the model to query. One of: deepseek-ai/DeepSeek-V2.5, deepseek-ai/DeepSeek-V2-Chat, deepseek-ai/DeepSeek-Coder-V2-Instruct, Tencent/Hunyuan-A52B-Instruct, Qwen/Qwen2.5-72B-Instruct-128K, Qwen/Qwen2.5-72B-Instruct, Qwen/Qwen2-VL-72B-Instruct, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Coder-32B-Instruct, Qwen/Qwen2.5-Coder-7B-Instruct, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-57B-A14B-Instruct, TeleAI/TeleChat2, TeleAI/TeleMM, 01-ai/Yi-1.5-34B-Chat-16K, 01-ai/Yi-1.5-9B-Chat-16K, 01-ai/Yi-1.5-6B-Chat, THUDM/chatglm3-6b, THUDM/glm-4-9b-chat, Vendor-A/Qwen/Qwen2-72B-Instruct, Vendor-A/Qwen/Qwen2.5-72B-Instruct, internlm/internlm2_5-7b-chat, internlm/internlm2_5-20b-chat, OpenGVLab/InternVL2-Llama3-76B, OpenGVLab/InternVL2-26B, nvidia/Llama-3.1-Nemotron-70B-Instruct, meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Meta-Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B-Instruct, google/gemma-2-27b-it, google/gemma-2-9b-it, Pro/Qwen/Qwen2.5-7B-Instruct, Pro/Qwen/Qwen2-7B-Instruct, Pro/Qwen/Qwen2-1.5B-Instruct, Pro/Qwen/Qwen2-VL-7B-Instruct, Pro/01-ai/Yi-1.5-9B-Chat-16K, Pro/01-ai/Yi-1.5-6B-Chat, Pro/THUDM/chatglm3-6b, Pro/THUDM/glm-4-9b-chat, Pro/internlm/internlm2_5-7b-chat, Pro/OpenGVLab/InternVL2-8B, Pro/meta-llama/Meta-Llama-3-8B-Instruct, Pro/meta-llama/Meta-Llama-3.1-8B-Instruct, Pro/google/gemma-2-9b-it
messages
A list of messages comprising the conversation so far. Each message has:

role
The role of the message author. One of: system, user, or assistant.

content
The contents of the message.
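As a hedged sketch, a minimal non-streaming request combining the model and messages fields might look like this (the endpoint URL and the choices[0].message.content response path are assumptions based on the common OpenAI-compatible layout, not spelled out in this reference):

```python
import requests

API_KEY = "YOUR_API_KEY"
URL = "https://api.siliconflow.cn/v1/chat/completions"  # assumed endpoint; substitute your provider's

payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",  # any model from the list above
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}

resp = requests.post(URL, headers={"Authorization": f"Bearer {API_KEY}"}, json=payload)
resp.raise_for_status()
# The choices[0].message.content nesting is assumed from common
# OpenAI-compatible responses; this reference documents the fields
# individually but not the full response shape.
print(resp.json()["choices"][0]["message"]["content"])
```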
stream
If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE].
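A sketch of consuming the stream, assuming the same endpoint and headers as above and the SSE framing the text describes (the choices[0].delta.content path in each chunk is an assumption borrowed from common OpenAI-compatible APIs):

```python
import json
import requests

API_KEY = "YOUR_API_KEY"
URL = "https://api.siliconflow.cn/v1/chat/completions"  # assumed endpoint

payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "Tell me a short story."}],
    "stream": True,
}

with requests.post(
    URL, headers={"Authorization": f"Bearer {API_KEY}"}, json=payload, stream=True
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":  # documented stream terminator
            break
        chunk = json.loads(data)
        # Assumed chunk shape; adjust if your provider differs.
        print(chunk["choices"][0]["delta"].get("content") or "", end="", flush=True)
```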
max_tokens
The maximum number of tokens to generate. Required range: 1 < x < 4096.
stop
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
temperature
Determines the degree of randomness in the response.
top_p
The top_p (nucleus) parameter dynamically adjusts the number of candidate tokens considered for each prediction, based on their cumulative probability.
n
Number of generations to return.
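Putting the sampling controls together, a hedged example payload (values are illustrative; it is sent exactly like the request sketch above):

```python
payload = {
    "model": "Qwen/Qwen2.5-7B-Instruct",
    "messages": [{"role": "user", "content": "Write a haiku about autumn."}],
    "max_tokens": 512,       # must satisfy 1 < x < 4096
    "stop": ["###", "END"],  # up to 4 stop sequences, excluded from the output
    "temperature": 0.7,      # degree of randomness
    "top_p": 0.9,            # nucleus sampling threshold
    "n": 2,                  # number of generations to return
}
```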
response_format
An object specifying the format that the model must output.

type
The type of the response format.
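Only the type field is documented here. Assuming it accepts a JSON mode value as in other OpenAI-compatible APIs (an assumption, not confirmed by this reference), the object might look like:

```python
# "json_object" is an assumed value; this reference only documents
# that response_format carries a "type" field.
payload["response_format"] = {"type": "json_object"}
```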
tools
A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported.

type
The type of the tool. Currently, only function is supported.

function
The function definition, with the following fields:

name
The name of the function to be called. Must contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.

description
A description of what the function does, used by the model to choose when and how to call the function.

parameters
The parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format. Omitting parameters defines a function with an empty parameter list.

strict
Whether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
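A sketch of a single function tool using the fields above; the function name and schema are hypothetical examples, not part of this API:

```python
tools = [
    {
        "type": "function",  # only "function" is supported
        "function": {
            "name": "get_weather",  # hypothetical example function
            "description": "Get the current weather for a given city.",
            "parameters": {  # a JSON Schema object
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name, e.g. Paris"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
            "strict": True,  # follow the exact schema (JSON Schema subset only)
        },
    }
]

payload = {
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,  # at most 128 function tools
}
```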
Response
finish_reason
The reason the model stopped generating tokens. One of: stop, eos, length, tool_calls.
tool_calls
The tool calls generated by the model, such as function calls. Each tool call has:

id
The ID of the tool call.

type
The type of the tool. Currently, only function is supported.

function
The function that the model called.

name
The name of the function to call.

arguments
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.
object
The object type, which is always chat.completion.
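A hedged sketch of handling a completed response, including the argument validation the arguments field calls for (the choices/message nesting is assumed from common OpenAI-compatible layouts, and get_weather is the hypothetical tool from the sketch above):

```python
import json

def handle_chat_completion(body: dict) -> None:
    # body is the parsed JSON from a non-streaming /chat/completions call.
    assert body.get("object") == "chat.completion"
    message = body["choices"][0]["message"]  # assumed nesting

    for call in message.get("tool_calls") or []:
        if call["type"] != "function":
            continue
        name = call["function"]["name"]
        try:
            # Arguments arrive as a JSON string; the model may emit invalid
            # JSON or hallucinate parameters, so validate before dispatching.
            args = json.loads(call["function"]["arguments"])
        except json.JSONDecodeError:
            print(f"invalid JSON arguments for {name}")
            continue
        if name == "get_weather" and isinstance(args.get("city"), str):
            print(f"would call {name} with {args}")  # hypothetical dispatch
```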