POST /rerank
curl --request POST \
  --url https://api.siliconflow.cn/v1/rerank \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "BAAI/bge-reranker-v2-m3",
  "query": "Apple",
  "documents": [
    "apple",
    "banana",
    "fruit",
    "vegetable"
  ],
  "top_n": 4,
  "return_documents": false,
  "max_chunks_per_doc": 1024,
  "overlap_tokens": 80
}'
{
  "id": "<string>",
  "results": [
    {
      "document": {
        "text": "<string>"
      },
      "index": 123,
      "relevance_score": 123
    }
  ],
  "tokens": {
    "input_tokens": 123,
    "output_tokens": 123
  }
}

Authorizations

Authorization
string
header
required

Use the following format for authentication: Bearer <your api key>
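As a minimal sketch of the header format above, the following Python prepares the request with the standard library but does not send it. The helper name `build_rerank_request` and the placeholder key are illustrative assumptions; the URL and header names are taken from the curl example on this page.

```python
import json
import urllib.request

RERANK_URL = "https://api.siliconflow.cn/v1/rerank"  # endpoint from the curl example


def build_rerank_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Hypothetical helper: prepare an authenticated POST without sending it."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_rerank_request(
    "YOUR_API_KEY",  # placeholder, not a real key
    {"model": "BAAI/bge-reranker-v2-m3", "query": "Apple", "documents": ["apple", "banana"]},
)
```

Passing the prepared object to `urllib.request.urlopen(request)` would perform the actual call.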

Body

application/json
model
enum<string>
required

Name of the model to use. To maintain service quality, we periodically change the models provided by this service, including but not limited to bringing models online or taking them offline and adjusting model service capabilities. Where feasible, we will notify you of such changes through announcements or push messages.

Available options:
BAAI/bge-reranker-v2-m3,
netease-youdao/bce-reranker-base_v1

Example:

"BAAI/bge-reranker-v2-m3"

query
string
default:Apple
required

The search query.

Example:

"Apple"

documents
string[]
required

Currently, only string lists are supported. Document objects will be supported in the future.

Example:

["apple", "banana", "fruit", "vegetable"]

top_n
integer

Number of most relevant documents (or their indices) to return.

Example:

4

return_documents
boolean
default:false

If false, the response does not include document text; if true, it includes the input document text.
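When `return_documents` is false, the caller has to map each result back to its input text. The sketch below assumes that `index` is the 0-based position of the document in the request's `documents` list, which this page does not state explicitly. The helper name `attach_documents` and the sample scores are made up for illustration.

```python
def attach_documents(documents: list[str], results: list[dict]) -> list[dict]:
    """Hypothetical helper: add the original text to each result by its index."""
    enriched = []
    for result in results:
        # Assumption: index is a 0-based position in the request's documents list.
        text = documents[result["index"]]
        enriched.append({**result, "document": {"text": text}})
    return enriched


documents = ["apple", "banana", "fruit", "vegetable"]
# Made-up results for illustration only; real scores come from the service.
sample_results = [
    {"index": 0, "relevance_score": 0.9},
    {"index": 2, "relevance_score": 0.4},
]
enriched = attach_documents(documents, sample_results)
```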

max_chunks_per_doc
integer
default:1024

Maximum number of chunks generated from a single document. Long documents are split into multiple chunks for scoring, and the highest chunk score is used as the document's score.
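The max-over-chunks rule can be illustrated locally. This is a sketch, not the service's implementation: `score_document` and `toy_score` are hypothetical names, the keyword scorer merely stands in for the reranker model, and dropping chunks beyond the limit is an assumption about how the cap is applied.

```python
from typing import Callable, Sequence


def score_document(
    chunks: Sequence[str],
    score_chunk: Callable[[str], float],
    max_chunks_per_doc: int = 1024,
) -> float:
    """Illustrative only: score up to max_chunks_per_doc chunks and keep the best."""
    if not chunks:
        raise ValueError("a document must yield at least one chunk")
    considered = chunks[:max_chunks_per_doc]  # assumption: extra chunks are dropped
    return max(score_chunk(chunk) for chunk in considered)


def toy_score(chunk: str) -> float:
    """Toy stand-in for the reranker: 1.0 if the chunk mentions 'apple', else 0.0."""
    return 1.0 if "apple" in chunk.lower() else 0.0


chunks = ["bananas are yellow", "an Apple a day", "carrots are vegetables"]
best = score_document(chunks, toy_score)
capped = score_document(chunks, toy_score, max_chunks_per_doc=1)
```

Here `best` reflects the second chunk, while `capped` only ever sees the first one.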

overlap_tokens
integer
default:80

Number of tokens shared between adjacent chunks when a document is split into chunks.

Required range: x <= 80
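To make the overlap concrete, here is a sliding-window sketch in which each chunk repeats the last `overlap_tokens` tokens of the previous one. The function name `chunk_tokens`, the `chunk_size` parameter, and the lower bound of 0 are assumptions for illustration; the page only documents the upper bound of 80 and does not describe the service's tokenizer or chunk size.

```python
def chunk_tokens(tokens: list[str], chunk_size: int, overlap_tokens: int = 80) -> list[list[str]]:
    """Illustrative sliding window: adjacent chunks share `overlap_tokens` tokens."""
    if not 0 <= overlap_tokens <= 80:  # upper bound from the docs; lower bound assumed
        raise ValueError("overlap_tokens must be between 0 and 80")
    if chunk_size <= overlap_tokens:
        raise ValueError("chunk_size must exceed overlap_tokens")
    stride = chunk_size - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


tokens = [f"t{i}" for i in range(10)]
windows = chunk_tokens(tokens, chunk_size=4, overlap_tokens=2)
```

With these toy values, every window begins with the last two tokens of the window before it.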

Response

200
application/json
id
string
required
results
object[]
required
tokens
object
required
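A client might decode the response along these lines. This is a sketch under stated assumptions: `RerankResult` and `parse_rerank_response` are hypothetical names, the sample body is made up to match the documented shape, and the sort is defensive because this page does not say whether results arrive already ordered.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RerankResult:
    index: int
    relevance_score: float
    text: Optional[str] = None  # present only when the response includes a document


def parse_rerank_response(body: str) -> list[RerankResult]:
    """Hypothetical helper: decode a response body and sort by score, highest first."""
    payload = json.loads(body)
    results = [
        RerankResult(
            index=item["index"],
            relevance_score=item["relevance_score"],
            text=(item.get("document") or {}).get("text"),
        )
        for item in payload["results"]
    ]
    return sorted(results, key=lambda r: r.relevance_score, reverse=True)


# Made-up body that follows the documented shape; values are illustrative.
sample_body = json.dumps({
    "id": "example-id",
    "results": [
        {"index": 1, "relevance_score": 0.2},
        {"index": 0, "relevance_score": 0.8, "document": {"text": "apple"}},
    ],
    "tokens": {"input_tokens": 12, "output_tokens": 0},
})
ranked = parse_rerank_response(sample_body)
```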