1. About DB-GPT

DB-GPT is an open-source AI-native data framework with AWEL (Agentic Workflow Expression Language) and Agents.

The goal is to build infrastructure for the large-model domain through capabilities such as multi-model management (SMMF), Text2SQL performance optimization, a RAG framework and its optimizations, multi-agent framework collaboration, and AWEL (intelligent orchestration). This makes it easier and more convenient to build data-driven model applications around databases.

2. Obtain an API key

2.1 Open the SiliconCloud website and register an account (if you already have an account, log in directly).

2.2 After completing registration, go to the API Key page and create an API key for later use.

3. Deploy DB-GPT

3.1 Clone the DB-GPT Source Code

git clone https://github.com/eosphoros-ai/DB-GPT.git

3.2 Create a Virtual Environment and Install Dependencies

# cd into the DB-GPT source root directory
cd DB-GPT

# DB-GPT requires Python >= 3.10
conda create -n dbgpt_env python=3.10
conda activate dbgpt_env

# Install the proxy-model dependencies
pip install -e ".[proxy]"

3.3 Configure basic environment variables

# Copy the template env file to .env
cp .env.template .env

3.4 Modify the environment variable file .env to configure the SiliconCloud models

# Use SiliconCloud's proxy model
LLM_MODEL=siliconflow_proxyllm
# Set the specific model name to use
SILICONFLOW_MODEL_VERSION=Qwen/Qwen2.5-Coder-32B-Instruct
SILICONFLOW_API_BASE=https://api.siliconflow.cn/v1
# Remember to fill in the API key you obtained in step 2
SILICONFLOW_API_KEY={your-siliconflow-api-key}

# Use SiliconCloud's embedding model
EMBEDDING_MODEL=proxy_http_openapi
PROXY_HTTP_OPENAPI_PROXY_SERVER_URL=https://api.siliconflow.cn/v1/embeddings
# Remember to fill in the API key you obtained in step 2
PROXY_HTTP_OPENAPI_PROXY_API_KEY={your-siliconflow-api-key}
# Set the specific embedding model name
PROXY_HTTP_OPENAPI_PROXY_BACKEND=BAAI/bge-large-zh-v1.5

# Use SiliconCloud's rerank model
RERANK_MODEL=rerank_proxy_siliconflow
RERANK_PROXY_SILICONFLOW_PROXY_SERVER_URL=https://api.siliconflow.cn/v1/rerank
# Remember to fill in the API key you obtained in step 2
RERANK_PROXY_SILICONFLOW_PROXY_API_KEY={your-siliconflow-api-key}
# Set the specific rerank model name
RERANK_PROXY_SILICONFLOW_PROXY_BACKEND=BAAI/bge-reranker-v2-m3

Note: The SILICONFLOW_API_KEY, PROXY_HTTP_OPENAPI_PROXY_API_KEY, and RERANK_PROXY_SILICONFLOW_PROXY_API_KEY environment variables should all be set to the API key you obtained in step 2. The names of the language model (SILICONFLOW_MODEL_VERSION), embedding model (PROXY_HTTP_OPENAPI_PROXY_BACKEND), and rerank model (RERANK_PROXY_SILICONFLOW_PROXY_BACKEND) can be obtained from the Get User Models List page on SiliconFlow.

3.5 Start the DB-GPT service

dbgpt start webserver --port 5670

Open the address http://127.0.0.1:5670/ in your browser to access the deployed DB-GPT.

4. Use SiliconCloud models via the DB-GPT Python SDK

4.1 Install the DB-GPT Python Package

pip install "dbgpt>=0.6.3rc2" openai requests numpy

The additional packages (openai, requests, numpy) are installed to facilitate the verification steps that follow.

4.2 Use the large language model from SiliconCloud

import asyncio
import os
from dbgpt.core import ModelRequest
from dbgpt.model.proxy import SiliconFlowLLMClient

model = "Qwen/Qwen2.5-Coder-32B-Instruct"
client = SiliconFlowLLMClient(
    api_key=os.getenv("SILICONFLOW_API_KEY"),
    model_alias=model
)

res = asyncio.run(
    client.generate(
        ModelRequest(
            model=model,
            messages=[
                {"role": "system", "content": "You are a helpful AI assistant."},
                {"role": "human", "content": "Hello"},
            ]
        )
    )
)
print(res)

4.3 Use the embedding model from SiliconCloud

import os
from dbgpt.rag.embedding import OpenAPIEmbeddings

openai_embeddings = OpenAPIEmbeddings(
    api_url="https://api.siliconflow.cn/v1/embeddings",
    api_key=os.getenv("SILICONFLOW_API_KEY"),
    model_name="BAAI/bge-large-zh-v1.5",
)

texts = ["Hello, world!", "How are you?"]
res = openai_embeddings.embed_documents(texts)
print(res)
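
embed_documents returns one vector (a list of floats) per input text, and a common follow-up is comparing vectors with cosine similarity. Below is a minimal sketch using numpy (installed in step 4.1); the vectors are tiny hand-written stand-ins rather than real API output, since bge-large-zh-v1.5 actually returns much higher-dimensional embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Tiny stand-in vectors for illustration only.
vec_a = [0.1, 0.3, 0.5]
vec_b = [0.1, 0.3, 0.5]
vec_c = [0.5, -0.2, 0.1]

print(cosine_similarity(vec_a, vec_b))  # identical vectors give 1.0
print(cosine_similarity(vec_a, vec_c))  # dissimilar vectors score lower
```

The same function applies unchanged to the vectors returned by embed_documents.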

4.4 Use the rerank model from SiliconCloud

import os
from dbgpt.rag.embedding import SiliconFlowRerankEmbeddings

embedding = SiliconFlowRerankEmbeddings(
    api_key=os.getenv("SILICONFLOW_API_KEY"),
    model_name="BAAI/bge-reranker-v2-m3",
)
res = embedding.predict("Apple", candidates=["苹果", "香蕉", "水果", "蔬菜"])
print(res)
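
predict returns one relevance score per candidate, in the order the candidates were passed; to actually rank the candidates you pair them with their scores and sort. A sketch with made-up scores standing in for real model output:

```python
candidates = ["苹果", "香蕉", "水果", "蔬菜"]
# Hypothetical relevance scores, one per candidate (not real model output).
scores = [0.92, 0.11, 0.45, 0.08]

# Pair each candidate with its score and sort by score, highest first.
ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```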

5. Hands-on guide

This section walks through a data conversation example. Data conversation means interacting with structured and semi-structured data in natural language, which can assist with data analysis and insights. The specific steps are:

1. Add data sources

First, select the data panel on the left and add a database. DB-GPT currently supports a variety of databases; choose the corresponding database type to add. Here, MySQL is used as a demonstration. The test data for the demonstration can be found in the [test examples](https://github.com/eosphoros-ai/DB-GPT/tree/main/docker/examples/sqls).

2. Select conversation type

Select the ChatData conversation type.

3. Start data conversation

Note: When conversing, select the corresponding model and database. DB-GPT also provides preview and edit modes.

Edit mode: