2026-04-24 22:52:36 +08:00
"""
Model services module (model_services)

Provides a unified interface for obtaining embedding, rerank, and generative LLM (chat) services, with automatic fallback.

---

## 📚 Generative LLM service (Chat)

### Dual-model services

| Function | Description |
|------|------|
| `get_chat_service()` | Returns the large-model service (for complex reasoning and generation) |
| `get_small_llm_service()` | Returns the lightweight model service (for simple intent classification and quick Q&A) |
| `get_all_chat_services()` | Returns all available generative LLM services (for switching between models) |

### Usage

```python
from app.model_services import get_chat_service, get_small_llm_service

# Get the large-model service (complex tasks)
llm = get_chat_service()
response = llm.invoke("What is LangGraph?")

# Get the lightweight model service (simple tasks)
small_llm = get_small_llm_service()
response = small_llm.invoke("Classify the user's intent: 'Hello'")
```
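
The split between a large and a lightweight model implies some routing decision in the caller. The sketch below shows one possible way to make that decision. `EchoModel`, `SIMPLE_TASKS`, and `pick_service` are hypothetical names introduced only for illustration, and the two getters are stubbed so the snippet runs on its own; in the project they would be imported from `app.model_services`.

```python
# Hypothetical stand-in for the objects returned by the real getters.
class EchoModel:
    def __init__(self, name: str) -> None:
        self.name = name

    def invoke(self, prompt: str) -> str:
        # Echo the prompt with the model name so the routing is visible.
        return f"[{self.name}] {prompt}"


# Stubs replacing app.model_services.get_chat_service / get_small_llm_service.
def get_chat_service() -> EchoModel:
    return EchoModel("large")


def get_small_llm_service() -> EchoModel:
    return EchoModel("small")


# Assumed task labels; which tasks count as "simple" is a guess.
SIMPLE_TASKS = {"intent_classification", "quick_qa"}


def pick_service(task: str) -> EchoModel:
    # Route simple tasks to the lightweight model, everything else to the large one.
    if task in SIMPLE_TASKS:
        return get_small_llm_service()
    return get_chat_service()


reply = pick_service("intent_classification").invoke("Classify the user's intent: 'Hello'")
```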

---

## 📚 Embedding model service (Embedding)

| Function | Description |
|------|------|
| `get_embedding_service()` | Returns the embedding model service (with automatic fallback) |

### Usage

```python
from app.model_services import get_embedding_service

# Get the embedding service (a LangChain-compatible Embeddings object)
embeddings = get_embedding_service()

# Embed a single query string via the standard LangChain Embeddings interface
vector = embeddings.embed_query("What is LangGraph?")
```
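
The table above mentions automatic fallback without describing it. A minimal sketch of one way such fallback could work is to try provider factories in priority order and return the first one that can be constructed. Everything here is an assumption made for illustration: `first_available`, `remote_embeddings`, and `local_embeddings` are hypothetical names, and the actual provider order and error handling in the module may differ.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_available(factories: Sequence[Callable[[], T]]) -> T:
    # Try each factory in priority order and return the first service that
    # can be constructed; keep the failures for the final error message.
    errors = []
    for factory in factories:
        try:
            return factory()
        except Exception as exc:  # a real implementation would catch narrower errors
            errors.append(f"{factory.__name__}: {exc}")
    raise RuntimeError("no provider available: " + "; ".join(errors))


# Hypothetical providers: the remote one fails, the local one works.
def remote_embeddings() -> str:
    raise ConnectionError("remote API unreachable")


def local_embeddings() -> str:
    return "local-embeddings"


service = first_available([remote_embeddings, local_embeddings])
```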

---

## 📚 Rerank model service (Rerank)

| Function | Description |
|------|------|
| `get_rerank_service()` | Returns the rerank model service (with automatic fallback) |

### Usage

```python
from app.model_services import get_rerank_service

# `documents` (the candidate documents) and `query` (the search string)
# are assumed to be defined by the caller.

# Get the rerank service and keep the 5 most relevant documents
reranker = get_rerank_service()
sorted_docs = reranker.compress_documents(documents, query, top_n=5)
```
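
To make the effect of `top_n` concrete, the sketch below imitates reranking with a toy scorer: score every document against the query, sort by descending score, and keep the first `top_n`. The word-overlap score is a deliberately simple stand-in for the rerank model, and `overlap_score` and `rerank` are hypothetical helpers, not part of this module.

```python
from typing import List


def overlap_score(query: str, text: str) -> int:
    # Toy relevance score: number of distinct query words that appear in the text.
    text_words = set(text.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in text_words)


def rerank(documents: List[str], query: str, top_n: int = 5) -> List[str]:
    # Sort by descending score; sorted() is stable, so ties keep input order.
    ranked = sorted(documents, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:top_n]


docs = [
    "llama.cpp serves local models",
    "LangGraph builds agent graphs",
    "LangGraph is a graph library for agent workflows",
]
top = rerank(docs, "graph library for agent workflows", top_n=2)
```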

---

## 🔧 Environment variable configuration

```env
# Zhipu API configuration
ZHIPUAI_API_KEY=***
ZHIPU_EMBEDDING_MODEL=embedding-3   # Options: embedding-2, embedding-3
ZHIPU_RERANK_MODEL=rerank-2         # Options: rerank-1, rerank-2
ZHIPU_API_BASE=https://open.bigmodel.cn/api/paas/v4

# DeepSeek API configuration (used for the large model)
DEEPSEEK_API_KEY=***

# Local llama.cpp service configuration (existing settings remain unchanged)
LLAMACPP_EMBEDDING_URL=http://localhost:port/v1
LLAMACPP_RERANKER_URL=http://localhost:port/v1
LLAMACPP_API_KEY=***
```
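
As an illustration of how the Zhipu variables above might be consumed, the sketch below reads them from a mapping (by default `os.environ`) into a small settings object. `ZhipuSettings` and `load_zhipu_settings` are hypothetical names, and using the sample values as defaults is an assumption; the module's real loading logic may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ZhipuSettings:
    api_key: Optional[str]
    embedding_model: str
    rerank_model: str
    api_base: str


def load_zhipu_settings(env: Mapping[str, str] = os.environ) -> ZhipuSettings:
    # Defaults mirror the sample values above; whether the real module
    # applies the same defaults is an assumption.
    return ZhipuSettings(
        api_key=env.get("ZHIPUAI_API_KEY"),
        embedding_model=env.get("ZHIPU_EMBEDDING_MODEL", "embedding-3"),
        rerank_model=env.get("ZHIPU_RERANK_MODEL", "rerank-2"),
        api_base=env.get("ZHIPU_API_BASE", "https://open.bigmodel.cn/api/paas/v4"),
    )


# Passing an explicit mapping keeps the example independent of the real environment.
settings = load_zhipu_settings({"ZHIPUAI_API_KEY": "dummy-key"})
```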
"""