LazyLLM is an open-source low-code large-model application development tool developed by SenseTime’s LazyAGI team. It provides one-stop tooling—from application building, data preparation, model deployment, fine-tuning to evaluation—so you can rapidly build AI applications at very low cost and continuously iterate to improve results.
## API Signup & Environment Setup
1. Create an account
2. Environment setup
See docs: Quick Start – LazyLLM
## API Usage Test
### 0. Set Environment Variables
You can use the following command to set the corresponding environment variable, or pass the key explicitly in code:

```shell
export LAZYLLM_SILICONFLOW_API_KEY=<your API key>
```
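If you prefer to stay inside Python, the same variable can also be set programmatically before any LazyLLM modules are created. A minimal sketch (the key value below is a placeholder, not a real key):

```python
import os

# Equivalent to the shell `export` above; must run before LazyLLM reads the variable.
os.environ["LAZYLLM_SILICONFLOW_API_KEY"] = "sk-your-key-here"

print(os.environ["LAZYLLM_SILICONFLOW_API_KEY"])
```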
### 1. Implement chat and image recognition
#### Text Q&A demo
After filling in the api_key, run the code below to quickly call the model and generate a Q&A-style front-end interface:
```python
import lazyllm
from lazyllm import OnlineChatModule, WebModule

api_key = 'sk-'  # replace with your issued API key

# Test the chat module
llm = OnlineChatModule(source='siliconflow', api_key=api_key, stream=False)
w = WebModule(llm, port=8846, title="siliconflow")
w.start().wait()
```
We ask “What is LazyLLM?”, and the result is as follows:
#### Multimodal Q&A demo
Pass an image through the lazyllm_files parameter in the input and ask about the image content to achieve multimodal question answering.
```python
import lazyllm
from lazyllm import OnlineChatModule

api_key = 'sk-'  # replace with your issued API key
llm = OnlineChatModule(source='siliconflow', api_key=api_key,
                       model='Qwen/Qwen2.5-VL-72B-Instruct')
print(llm('Hello, what is this?', lazyllm_files=['your_picture.png']))
```
Here we use this image to test multimodal Q&A:
Output in the command line:
### 2. Implement text-to-image and text-to-speech
Use OnlineMultiModalModule for text-to-image and text-to-speech. After running, it prints the paths of the generated files.
```python
import lazyllm
from lazyllm import OnlineMultiModalModule

api_key = 'sk-xxx'

# Test text-to-image: function='text2image'
llm = OnlineMultiModalModule(source='siliconflow', api_key=api_key, function='text2image')
print(llm("Generate a cute puppy"))

# Test text-to-speech: function='tts'
llm = OnlineMultiModalModule(source='siliconflow', api_key=api_key, function='tts')
print(llm("Hello, what is your name?", voice='fnlp/MOSS-TTSD-v0.5:anna'))
```
Run result:
The generated audio is as follows:
| File | Size | Generated at |
|---|---|---|
| tmpck44zfds.mp3 | 55.13 KB | 2025-10-27 23:13 |
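The TTS module writes its output to a temporary file and returns the path. If you want to keep the audio, copy it somewhere persistent. A minimal sketch (the stand-in temp file below only simulates the path printed by the TTS module; in real use, pass that path instead):

```python
import shutil
import tempfile
from pathlib import Path

def save_audio(generated_path: str, target_dir: str = "outputs") -> Path:
    """Copy a generated temp audio file into a persistent directory."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    dest = target / Path(generated_path).name
    shutil.copy(generated_path, dest)
    return dest

# Stand-in for the path returned by the TTS module:
tmp = tempfile.NamedTemporaryFile(suffix=".mp3", delete=False)
tmp.write(b"fake-audio-bytes")
tmp.close()
print(save_audio(tmp.name))
```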
### 3. Knowledge-base Q&A in 10+ lines of code
#### Implement Embed and Rerank functions
Run the code below to compute vector embeddings with OnlineEmbeddingModule; set type='rerank' to call a reranking model.
```python
import lazyllm
from lazyllm import OnlineEmbeddingModule

api_key = 'sk-'

# Test the embed module
llm = OnlineEmbeddingModule(source='siliconflow', api_key=api_key)
print(llm("apple"))

# Test the rerank module
llm = OnlineEmbeddingModule(source='siliconflow', api_key=api_key, type='rerank')
print(llm(["apple", ['apple', 'banana', 'orange']]))
```
The vectorized result is as follows:
```
[-0.0024823144, -0.0075530247, -0.013154144, -0.031351723, -0.024489744, 0.009692847, 0.008086464, -0.037946977, 0.013251133, -0.046675995, -0.011390155, -0.011111312, 0.016779112, 0.054168403, 0.04849454, 0.014742341, 0.02341074, -0.015542501, 0.059939254, -0.024223024, 0.0065467632, -0.041244607, -0.022925794, -0.024804957, 0.006752865, -0.047548898, -0.03685585, 0.0513557...., -0.070656545, -0.01997975, 0.023398615, 0.008735079]
```
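Embedding vectors like this are typically compared with cosine similarity: texts with similar meaning map to vectors pointing in similar directions. A minimal sketch in plain Python (the short vectors below are made-up toy values, not real model output):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v_apple = [0.1, 0.3, -0.2]    # toy embedding
v_pear = [0.09, 0.28, -0.25]  # toy embedding
v_car = [-0.4, 0.05, 0.6]     # toy embedding

print(cosine_similarity(v_apple, v_pear))  # similar direction, close to 1
print(cosine_similarity(v_apple, v_car))   # dissimilar, much lower
```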
The word similarity scores are as follows:
```
[{'index': 0, 'relevance_score': 0.9946065545082092}, {'index': 2, 'relevance_score': 0.014802767895162106}, {'index': 1, 'relevance_score': 0.0004139931406825781}]
```
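Each entry pairs a candidate's position in the input list with its relevance score. To recover the candidate texts in ranked order, sort by score and map the indices back, as this small sketch using the scores above shows:

```python
candidates = ['apple', 'banana', 'orange']
scores = [
    {'index': 0, 'relevance_score': 0.9946065545082092},
    {'index': 2, 'relevance_score': 0.014802767895162106},
    {'index': 1, 'relevance_score': 0.0004139931406825781},
]

# Sort by descending relevance, then map each index back to its candidate text.
ranked = [candidates[s['index']]
          for s in sorted(scores, key=lambda s: s['relevance_score'], reverse=True)]
print(ranked)  # ['apple', 'orange', 'banana']
```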
#### Knowledge-base import
We use Chinese classical texts as an example knowledge base. After downloading, place them in the database folder. Sample dataset download link: Sample Dataset Download
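If you just want to try the flow without downloading the sample dataset, any plain-text files in the folder will do. A minimal sketch that creates the database folder with one toy document (the file name and content are illustrative):

```python
from pathlib import Path

db = Path("database")
db.mkdir(exist_ok=True)

# A toy document; in real use, put the downloaded dataset files here instead.
(db / "tao_te_ching_excerpt.txt").write_text(
    "The Way of Heaven benefits and does not harm.", encoding="utf-8"
)

print(sorted(p.name for p in db.iterdir()))
```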
First define the embed model, then use LazyLLM’s Document component to create a document management module for importing the knowledge base.
```python
import lazyllm

api_key = 'sk-'
embed_model = lazyllm.OnlineEmbeddingModule(source="siliconflow", api_key=api_key)
documents = lazyllm.Document(
    dataset_path="database",
    embed=embed_model
)
```
#### Knowledge-base retrieval
Now that we have an external knowledge base, we can use the Retriever component in LazyLLM to retrieve the knowledge base and recall relevant content.
Usage example:
```python
import lazyllm
from lazyllm.tools import Retriever, Document, SentenceSplitter

api_key = 'sk-'
embed_model = lazyllm.OnlineEmbeddingModule(source="siliconflow", api_key=api_key)
documents = Document(dataset_path='database', embed=embed_model, manager=False)
rm = Retriever(documents, group_name='CoarseChunk', similarity='bm25', similarity_cut_off=0.01, topk=6)
rm.start()
print(rm("user query"))
```
#### Knowledge-base Q&A
Combining the above model, document management, and retrieval modules, we can build a complete dataflow using LazyLLM’s built-in Flow component. The full code is as follows:
```python
import lazyllm
from lazyllm import (
    OnlineEmbeddingModule, OnlineChatModule, Document, SentenceSplitter,
    Retriever, Reranker, ChatPrompter, pipeline, bind
)

# Initialize the api key and prompt
api_key = 'sk-'
prompt = """
You will play the role of an AI Q&A assistant and complete a dialogue task.
In this task, you need to provide your answer based on the given context and question.
"""

# Initialize models
embed_model = OnlineEmbeddingModule(source="siliconflow", api_key=api_key)
rerank_model = OnlineEmbeddingModule(source="siliconflow", api_key=api_key, type="rerank")
llm = OnlineChatModule(source="siliconflow", api_key=api_key)

# Define the document management module and create node groups
doc = Document(dataset_path="/home/xxx/database", manager=False, embed=embed_model)
doc.create_node_group(name="block", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)
doc.create_node_group(name="line", transform=SentenceSplitter, chunk_size=128, chunk_overlap=20, parent="block")

# Build the RAG pipeline (multi-route retrieval → rerank → prompt formatting → LLM answering)
with pipeline() as ppl:
    with lazyllm.parallel().sum as ppl.prl:
        ppl.prl.r1 = Retriever(doc, group_name='line', similarity="cosine", topk=6, target='block')
        ppl.prl.r2 = Retriever(doc, group_name='block', similarity="cosine", topk=6)
    ppl.reranker = Reranker('ModuleReranker', model=rerank_model, output_format='content',
                            join=True) | bind(query=ppl.input)
    ppl.formatter = (lambda context, query: dict(context_str=str(context), query=query)) | bind(query=ppl.input)
    ppl.llm = llm.prompt(ChatPrompter(prompt, extra_keys=["context_str"]))

ppl.start()
query = "What is the Way of Heaven?"
print(ppl(query))
```
You can see that the RAG component successfully retrieves content related to the “Way of Heaven” from the Tao Te Ching and other sources, and passes it to the LLM for answering.