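Rate Limit Details
Specific rate limit values for each model
The platform currently offers five categories of models: text generation, image generation, embedding, reranking, and multimodal.
The specific rate limits for each model are listed below:
Text Generation
Free Models
Text generation model | L0~L5 |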
---|---|
01-ai/Yi-1.5-9B-Chat-16K | RPM=1K TPM=50K |
01-ai/Yi-1.5-6B-Chat | RPM=1K TPM=50K |
google/gemma-2-9b-it | RPM=1K TPM=50K |
internlm/internlm2_5-7b-chat | RPM=1K TPM=50K |
meta-llama/Meta-Llama-3-8B-Instruct | RPM=1K TPM=50K |
meta-llama/Meta-Llama-3.1-8B-Instruct | RPM=1K TPM=50K |
mistralai/Mistral-7B-Instruct-v0.2 | RPM=1K TPM=50K |
Qwen/Qwen1.5-7B-Chat | RPM=1K TPM=50K |
Qwen/Qwen2-1.5B-Instruct | RPM=1K TPM=50K |
Qwen/Qwen2-7B-Instruct | RPM=1K TPM=50K |
THUDM/chatglm3-6b | RPM=1K TPM=50K |
THUDM/glm-4-9b-chat | RPM=1K TPM=50K |
Vendor-A/Qwen/Qwen2-72B-Instruct | RPM=1K TPM=50K |
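
The limit cells in these tables use a compact notation such as `RPM=1K TPM=50K`. As a minimal sketch for reading them programmatically, the helper below converts a cell into integers. It assumes that the `K` suffix means a multiplier of 1,000; the function name `parse_limits` is hypothetical and not part of any platform SDK.

```python
import re

# Assumption: a trailing "K" means "times 1,000" (e.g. "1.2K" -> 1200).
_CELL_RE = re.compile(r"(\w+)=([\d.]+)(K?)")


def parse_limits(cell: str) -> dict:
    """Parse a table cell such as 'RPM=1.2K TPM=120K' into integer limits."""
    limits = {}
    for name, number, suffix in _CELL_RE.findall(cell):
        multiplier = 1000 if suffix == "K" else 1
        # round() guards against floating-point noise in values like 1.2 * 1000.
        limits[name] = round(float(number) * multiplier)
    return limits


print(parse_limits("RPM=1K TPM=50K"))   # {'RPM': 1000, 'TPM': 50000}
print(parse_limits("IPM=2 IPD=400"))    # {'IPM': 2, 'IPD': 400}
```

The same helper works for the image generation cells, which carry no `K` suffix.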
Paid Models
0-10B (upper bound exclusive)
Text generation model | L0 | L1 | L2 | L3 | L4 | L5 |
---|---|---|---|---|---|---|
Pro/01-ai/Yi-1.5-6B-Chat | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/01-ai/Yi-1.5-9B-Chat-16K | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/google/gemma-2-9b-it | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/internlm/internlm2_5-7b-chat | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/meta-llama/Meta-Llama-3.1-8B-Instruct | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/meta-llama/Meta-Llama-3-8B-Instruct | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/mistralai/Mistral-7B-Instruct-v0.2 | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/Qwen/Qwen1.5-7B-Chat | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/Qwen/Qwen2-1.5B-Instruct | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/Qwen/Qwen2-7B-Instruct | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
Pro/THUDM/chatglm3-6b | RPM=1K TPM=80K | RPM=1.2K TPM=120K | RPM=2K TPM=160K | RPM=4K TPM=320K | RPM=8K TPM=1000K | RPM=10K TPM=5000K |
10-50B (upper bound exclusive)
Text generation model | L0 | L1 | L2 | L3 | L4 | L5 |
---|---|---|---|---|---|---|
Pro/THUDM/glm-4-9b-chat | RPM=1K TPM=40K | RPM=1.2K TPM=60K | RPM=2K TPM=80K | RPM=4K TPM=160K | RPM=8K TPM=500K | RPM=10K TPM=2000K |
01-ai/Yi-1.5-34B-Chat-16K | RPM=1K TPM=40K | RPM=1.2K TPM=60K | RPM=2K TPM=80K | RPM=4K TPM=160K | RPM=8K TPM=500K | RPM=10K TPM=2000K |
internlm/internlm2_5-20b-chat | RPM=1K TPM=40K | RPM=1.2K TPM=60K | RPM=2K TPM=80K | RPM=4K TPM=160K | RPM=8K TPM=500K | RPM=10K TPM=2000K |
google/gemma-2-27b-it | RPM=1K TPM=40K | RPM=1.2K TPM=60K | RPM=2K TPM=80K | RPM=4K TPM=160K | RPM=8K TPM=500K | RPM=10K TPM=2000K |
50-200B (upper bound exclusive)
Text generation model | L0 | L1 | L2 | L3 | L4 | L5 |
---|---|---|---|---|---|---|
deepseek-ai/deepseek-llm-67b-chat | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
meta-llama/Meta-Llama-3.1-70B-Instruct | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
meta-llama/Meta-Llama-3-70B-Instruct | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
mistralai/Mixtral-8x7B-Instruct-v0.1 | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
Qwen/Qwen1.5-110B-Chat | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
Qwen/Qwen2-57B-A14B-Instruct | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
Qwen/Qwen2-72B-Instruct | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
Qwen/Qwen2-Math-72B-Instruct | RPM=1K TPM=20K | RPM=1.2K TPM=30K | RPM=2K TPM=40K | RPM=4K TPM=80K | RPM=8K TPM=250K | RPM=10K TPM=1000K |
deepseek-ai/DeepSeek-Coder-V2-Instruct | RPM=1K TPM=10K | RPM=1.2K TPM=15K | RPM=2K TPM=20K | RPM=4K TPM=40K | RPM=8K TPM=125K | RPM=10K TPM=500K |
200B and above
Text generation model | L0 | L1 | L2 | L3 | L4 | L5 |
---|---|---|---|---|---|---|
deepseek-ai/DeepSeek-V2-Chat | RPM=1K TPM=10K | RPM=1.2K TPM=15K | RPM=2K TPM=20K | RPM=4K TPM=40K | RPM=8K TPM=125K | RPM=10K TPM=500K |
meta-llama/Meta-Llama-3.1-405B-Instruct | RPM=1K TPM=10K | RPM=1.2K TPM=15K | RPM=2K TPM=20K | RPM=4K TPM=40K | RPM=8K TPM=125K | RPM=10K TPM=500K |
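
To stay under a pair of limits from the tables above, a client can track its own usage before sending each request. The following is a minimal sketch of a sliding-window guard, reading RPM as requests per minute and TPM as tokens per minute. How the platform itself measures the window, and which tokens count toward TPM, is not stated here, so the 60-second sliding window and the class name `SlidingWindowLimiter` are assumptions for illustration only.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Client-side guard for a requests-per-minute and tokens-per-minute budget."""

    def __init__(self, rpm: int, tpm: int, window: float = 60.0):
        self.rpm = rpm
        self.tpm = tpm
        self.window = window            # assumed window length in seconds
        self._events = deque()          # (timestamp, tokens) of accepted requests
        self._tokens_in_window = 0

    def _evict(self, now: float) -> None:
        # Drop requests that have aged out of the window.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._tokens_in_window -= tokens

    def try_acquire(self, tokens: int, now: Optional[float] = None) -> bool:
        """Return True and record the request if it fits both budgets."""
        now = time.monotonic() if now is None else now
        self._evict(now)
        if len(self._events) + 1 > self.rpm:
            return False
        if self._tokens_in_window + tokens > self.tpm:
            return False
        self._events.append((now, tokens))
        self._tokens_in_window += tokens
        return True


# Example: the L0 limits of the paid 0-10B tier (RPM=1K TPM=80K).
limiter = SlidingWindowLimiter(rpm=1000, tpm=80_000)
if limiter.try_acquire(tokens=1_500):
    print("request may be sent")
else:
    print("wait before retrying")
```

A guard like this only reduces the chance of hitting the server-side limit; the client should still handle rejected requests, for example by retrying with backoff.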
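Embedding
Free for a limited time, with a fixed rate limit (RPM 2,000; TPM 500,000). The table below lists the rate limits of the embedding models currently available on the platform.
Embedding model | L0~L5 |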
---|---|
BAAI/bge-m3 | RPM=2K TPM=500K |
BAAI/bge-large-en-v1.5 | RPM=2K TPM=500K |
BAAI/bge-large-zh-v1.5 | RPM=2K TPM=500K |
netease-youdao/bce-embedding-base_v1 | RPM=2K TPM=500K |
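Reranking
Free for a limited time, with a fixed rate limit (RPM 2,000; TPM 500,000). The table below lists the rate limits of the reranking models currently available on the platform.
Reranking model | L0~L5 |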
---|---|
BAAI/bge-reranker-v2-m3 | RPM=2K TPM=500K |
netease-youdao/bce-reranker-base_v1 | RPM=2K TPM=500K |
Image Generation
Free Models
Image generation model | L0~L5 |
---|---|
black-forest-labs/FLUX.1-dev | IPM=2 IPD=400 |
black-forest-labs/FLUX.1-schnell | IPM=2 IPD=400 |
ByteDance/SDXL-Lightning | IPM=2 IPD=400 |
InstantX/InstantID | IPM=2 IPD=400 |
stabilityai/stable-diffusion-xl-base-1.0 | IPM=2 IPD=400 |
stabilityai/stable-diffusion-2-1 | IPM=2 IPD=400 |
stabilityai/sdxl-turbo | IPM=2 IPD=400 |
stabilityai/sd-turbo | IPM=2 IPD=400 |
stabilityai/stable-diffusion-3-medium | IPM=2 IPD=400 |
TencentARC/PhotoMaker | IPM=2 IPD=400 |
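
As a worked example of what the free-tier cell `IPM=2 IPD=400` implies, the sketch below computes request pacing. It reads IPM as images per minute and IPD as images per day, and assumes the daily quota covers a 24-hour period; the function name `pacing` is hypothetical.

```python
def pacing(ipm: int, ipd: int) -> tuple:
    """Return three pacing figures for an image quota.

    1. Seconds between requests when running at the per-minute cap.
    2. Minutes until the daily quota is used up at that pace.
    3. Seconds between requests when spreading the daily quota over 24 hours.
    """
    seconds_at_minute_cap = 60 / ipm
    minutes_to_daily_cap = ipd / ipm
    seconds_if_spread_over_day = 86_400 / ipd  # assumption: quota spans 24 hours
    return seconds_at_minute_cap, minutes_to_daily_cap, seconds_if_spread_over_day


print(pacing(2, 400))  # (30.0, 200.0, 216.0)
```

At the per-minute cap, one image every 30 seconds uses up the daily quota after 200 minutes; spreading the quota evenly over a day means one image every 216 seconds.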
Paid Models
Image generation model | L0 | L1 | L2 | L3 | L4 | L5 |
---|---|---|---|---|---|---|
black-forest-labs/FLUX.1-dev | IPM=2 IPD=2880 | IPM=4 IPD=5760 | IPM=10 IPD=14400 | IPM=20 IPD=28800 | IPM=40 IPD=57600 | IPM=100 IPD=144000 |
Multimodal
Free for a limited time. The table below lists the multimodal models currently available on the platform.
Multimodal model | L0~L5 |
---|---|
THUDM/CogVideoX-2b | / |
iic/SenseVoiceSmall | / |