POST /chat/completions

Authorizations

Authorization (string, header, required)
Use the following format for authentication: Bearer <your API key>
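
For example, a minimal sketch in Python (the key value is a placeholder):

```python
# Minimal sketch: building the required Authorization header.
API_KEY = "YOUR_API_KEY"  # placeholder; never commit real keys

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json",
}
```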

Body

application/json

messages (object[], required)
A list of messages comprising the conversation so far.

messages.content (object[], required)
An array of content parts, each with a defined type. When passing in images, a part's type can be text or image_url; to pass multiple images, add multiple image_url content parts.

messages.content.image_url (object, required)

messages.content.image_url.url (string, required, default: https://sf-maas-uat-prod.oss-cn-shanghai.aliyuncs.com/dog.png)
Either a URL of the image or the base64-encoded image data. TeleAI/TeleMM supports only base64-encoded image data.
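
For models that require base64 input, the image must be inlined in the request. A minimal sketch in Python; whether the field takes raw base64 or a data URI is an assumption here (many OpenAI-compatible APIs use the data-URI form):

```python
import base64

# Read a local image and base64-encode it for the image_url.url field.
with open("dog.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# Data-URI form is an assumption; confirm the exact format your model expects.
image_url_value = f"data:image/png;base64,{encoded}"
```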

messages.content.image_url.detail (enum<string>, default: auto)
Specifies the detail level of the image.
Available options: auto, low, high

messages.content.type (enum<string>, required, default: image_url)
The type of the content part.
Available options: image_url

messages.role (enum<string>, required, default: user)
The role of the message author. One of system, user, or assistant.
Available options: user, assistant, system
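
Putting these fields together, a conversation mixing a text part and an image part might look like the sketch below (the text content-part shape and plain-string system content follow the common OpenAI-style convention; only image_url is spelled out above):

```python
messages = [
    # Plain-string system content is a common convention, assumed here.
    {"role": "system", "content": "You are a helpful vision assistant."},
    {
        "role": "user",
        "content": [
            # Text part: shape assumed from the OpenAI-style convention.
            {"type": "text", "text": "What breed is the dog in this picture?"},
            # Image part: a URL or base64 data, per image_url.url above.
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://sf-maas-uat-prod.oss-cn-shanghai.aliyuncs.com/dog.png",
                    "detail": "auto",
                },
            },
        ],
    },
]
```
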
model (enum<string>, required, default: Qwen/QVQ-72B-Preview)
The name of the model to use. To improve service quality, we make changes to the models offered by this service from time to time, including but not limited to taking models online or offline and adjusting model capabilities. Where feasible, we will give notice through appropriate channels such as announcements or message pushes.
Available options: Qwen/QVQ-72B-Preview, deepseek-ai/deepseek-vl2, Qwen/Qwen2-VL-72B-Instruct, OpenGVLab/InternVL2-26B, Pro/Qwen/Qwen2-VL-7B-Instruct, Pro/OpenGVLab/InternVL2-8B, TeleAI/TeleMM

frequency_penalty (number, default: 0.5)
Penalizes new tokens based on their frequency in the text so far, reducing the model's tendency to repeat itself.

max_tokens (integer, default: 512)
The maximum number of tokens to generate.
Required range: 1 < x < 4096

n (integer, default: 1)
Number of generations to return.

response_format (object)
An object specifying the format that the model must output.

response_format.type (string)
The type of the response format.

stop
Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

stream (boolean, default: false)
If set, tokens are returned as Server-Sent Events as they become available. The stream terminates with data: [DONE].
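
A sketch of consuming the stream in Python (the host is a placeholder, and the per-chunk delta shape is assumed from OpenAI-style streaming):

```python
import json
import requests

BASE_URL = "https://api.example.com/v1"  # placeholder host
headers = {"Authorization": "Bearer YOUR_API_KEY",
           "Content-Type": "application/json"}

payload = {
    "model": "Qwen/Qwen2-VL-72B-Instruct",
    "messages": [{"role": "user",
                  "content": [{"type": "text", "text": "Hello"}]}],
    "stream": True,
}

with requests.post(f"{BASE_URL}/chat/completions", headers=headers,
                   json=payload, stream=True) as resp:
    for raw in resp.iter_lines():
        if not raw:
            continue
        line = raw.decode("utf-8")
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # stream terminator, as documented
            break
        chunk = json.loads(data)
        # Delta shape is an assumption from OpenAI-style streaming chunks.
        delta = chunk["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
```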

temperature (number, default: 0.7)
Determines the degree of randomness in the response.

top_k (number, default: 50)
Limits sampling to the k most likely next tokens.

top_p (number, default: 0.7)
The top_p (nucleus) parameter dynamically adjusts the number of candidate tokens at each step based on their cumulative probability.
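
A complete non-streaming request combining the body parameters above (the host is a placeholder; parameter values are the documented defaults):

```python
import requests

BASE_URL = "https://api.example.com/v1"  # placeholder host
headers = {"Authorization": "Bearer YOUR_API_KEY",
           "Content-Type": "application/json"}

payload = {
    "model": "Qwen/QVQ-72B-Preview",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {
                 "url": "https://sf-maas-uat-prod.oss-cn-shanghai.aliyuncs.com/dog.png",
                 "detail": "auto",
             }},
        ],
    }],
    "max_tokens": 512,
    "n": 1,
    "temperature": 0.7,
    "top_k": 50,
    "top_p": 0.7,
    "frequency_penalty": 0.5,
    "stream": False,
}

resp = requests.post(f"{BASE_URL}/chat/completions", headers=headers, json=payload)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```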

Response

200 - application/json

choices (object[])

choices.finish_reason (enum<string>)
Available options: stop, eos, length, tool_calls

choices.message (object)

choices.message.content (string)

choices.message.role (string)

created (integer)

id (string)

model (string)

object (enum<string>)
Available options: chat.completion

tool_calls (object[])
The tool calls generated by the model, such as function calls.

tool_calls.function (object, required)
The function that the model called.

tool_calls.function.arguments (string, required)
The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

tool_calls.function.name (string, required)
The name of the function to call.

tool_calls.id (string, required)
The ID of the tool call.

tool_calls.type (enum<string>, required)
The type of the tool. Currently, only function is supported.
Available options: function
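
Since arguments is model-generated JSON, parse and validate it before executing anything. A defensive sketch (the allowed-parameter check against your own function schema is illustrative):

```python
import json

def parse_tool_call(tool_call: dict, allowed_params: set) -> tuple:
    """Validate a documented tool_calls entry before dispatching it."""
    if tool_call.get("type") != "function":  # only 'function' is supported
        raise ValueError(f"unsupported tool type: {tool_call.get('type')!r}")

    fn = tool_call["function"]
    try:
        # May raise: the model does not always generate valid JSON.
        args = json.loads(fn["arguments"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON arguments for {fn['name']}: {exc}")

    # Reject hallucinated parameters not defined by your function schema.
    unexpected = set(args) - allowed_params
    if unexpected:
        raise ValueError(f"unexpected parameters: {sorted(unexpected)}")
    return fn["name"], args
```
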
usage (object)

usage.completion_tokens (integer)

usage.prompt_tokens (integer)

usage.total_tokens (integer)
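
A short sketch that walks the documented response fields from a parsed 200 body:

```python
def summarize_completion(data: dict) -> None:
    """Print the documented fields of a parsed chat.completion response."""
    print("id:", data["id"], "| model:", data["model"], "| object:", data["object"])

    for choice in data.get("choices", []):
        # finish_reason is one of: stop, eos, length, tool_calls.
        print("finish_reason:", choice["finish_reason"])
        print("assistant:", choice["message"]["content"])

    # Token accounting, useful for billing and context budgeting.
    usage = data.get("usage", {})
    print(f"tokens: prompt={usage.get('prompt_tokens')} "
          f"completion={usage.get('completion_tokens')} "
          f"total={usage.get('total_tokens')}")
```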