README.md
@@ -49,21 +49,20 @@
- ✅ **Backward compatible**: switch between hybrid routing and pure ReAct mode with `use_hybrid_router=True/False`

---
The following is the complete architecture section, revised according to our discussion. It can be pasted directly into the README.

```markdown
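## 🏗️ Technical Architecture

### Tech Stack Overview
### 1. Tech Stack Overview

| Layer | Component | Technology | Description |
|------|------|---------|------|
| **Agent framework** | Workflow orchestration | LangGraph + LangChain | State-machine-driven agent workflows |
| **Main graph system** | Main flow | main_graph/ | Hybrid routing + ReAct loop + tool execution |
| **Subgraph system** | Modular subgraphs | subgraphs/ | Contacts, dictionary, news analysis, and other subgraphs |
| | Core tools | core/ | Intent understanding, formatted output, human review, web search, visualization charts |
| **Vector database** | Vector retrieval | Qdrant | High-performance vector similarity search (remote server) |
| **Backend framework** | API service | FastAPI + Uvicorn | RESTful API + SSE streaming output |
| **Frontend framework** | Web UI | Streamlit | Interactive chat interface |
| **Relational database** | Persistent storage | PostgreSQL | Conversation memory persistence (remote server) |
| **Vector database** | Vector retrieval | Qdrant | High-performance vector similarity search (remote server) |
| **Containerization** | Service orchestration | Docker + Docker Compose | One-command deployment of all services |
| **CI/CD** | Automated deployment | Gitea Workflows | Automatic build and deployment on code push |
| **LLM service** | Cloud model | Zhipu AI (glm-4-plus) | Fast responses, suited to everyday conversation |
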
@@ -75,267 +74,245 @@
| | Rerank service | rerank_services.py | Unified reranking interface |
| **Embedding** | Vector embedding | llama.cpp server | Local embedding service (:18001) |

### System Architecture Flowchart
---

### 2. System Overview

A high-level view of how the system components interact, with execution details hidden.

```mermaid
graph TB
    User[User browser] --> Frontend[Streamlit frontend :8501]
    Frontend --> Backend[FastAPI backend :8079]
    subgraph UserLayer["👤 User layer"]
        Browser["Browser"]
        Streamlit["Streamlit frontend<br/>:8501"]
    end

    Backend --> AgentService[AIAgentService]
    subgraph AppLayer["⚙️ Application service layer"]
        FastAPI["FastAPI backend<br/>:8079"]
        Agent["AIAgentService<br/>(agent coordination)"]
    end

    AgentService --> LangGraph[LangGraph workflow engine]
    subgraph EngineLayer["🧠 Agent engine"]
        LangGraph["LangGraph workflow engine<br/>Routing / ReAct / tool calls"]
    end

    LangGraph --> RetrieveMemory[Memory retrieval retrieve_memory]
    LangGraph --> MemoryTrigger[Memory trigger memory_trigger]
    LangGraph --> InitState[State initialization init_state]
    LangGraph --> HybridRouter[Hybrid routing hybrid_router]
    LangGraph --> ReactLoop[ReAct loop react_loop]
    LangGraph --> FastPath[Fast path fast_*]
    LangGraph --> LLMCall[LLM call llm_call]
    LangGraph --> Summarize[Memory summarization summarize]
    LangGraph --> Finalize[Finalization finalize]
    subgraph ServicesLayer["🧩 Domain services"]
        RAG["RAG retrieval service"]
        Tools["Toolset<br/>(search, contacts, dictionary, etc.)"]
    end

    HybridRouter --> FastChitchat[fast_chitchat]
    HybridRouter --> FastRAG[fast_rag]
    HybridRouter --> FastTool[fast_tool]
    HybridRouter --> ReactLoop
    subgraph ModelLayer["🤖 Model layer"]
        LLM["LLM service<br/>(multi-model fallback chain)"]
        Embedding["Embedding service<br/>:18001"]
        Rerank["Rerank service<br/>:18002"]
    end

    ReactLoop --> ReactReason[react_reason reasoning node]
    ReactLoop --> RAGRetrieve[rag_retrieve]
    ReactLoop --> WebSearch[web_search]
    ReactLoop --> ContactSubgraph[contact_subgraph]
    ReactLoop --> DictionarySubgraph[dictionary_subgraph]
    ReactLoop --> NewsSubgraph[news_analysis_subgraph]
    ReactLoop --> HandleError[handle_error]
    subgraph DataLayer["💾 Data layer"]
        Qdrant["Qdrant vector store"]
        PostgreSQL["PostgreSQL"]
    end

    RAGRetrieve --> Qdrant[Qdrant vector store]
    RAGRetrieve --> RerankService[Rerank service]
    RAGRetrieve --> EmbeddingService[Embedding service]
    Browser --> Streamlit
    Streamlit --> FastAPI
    FastAPI --> Agent
    Agent --> LangGraph
    LangGraph --> RAG
    LangGraph --> Tools
    LangGraph --> LLM
    RAG --> Embedding
    RAG --> Rerank
    RAG --> Qdrant
    Agent --> PostgreSQL
    LangGraph --> PostgreSQL

    AgentService --> ChatServices[Model service layer chat_services]
    ChatServices --> FallbackChain[FallbackServiceChain]
    FallbackChain --> Zhipu[Zhipu GLM-4]
    FallbackChain --> DeepSeek[DeepSeek V3]
    FallbackChain --> OpenAI[OpenAI GPT-4o]
    FallbackChain --> LocalQwen[Local Qwen3.5-9B]

    RetrieveMemory --> PostgreSQL[PostgreSQL]
    Summarize --> PostgreSQL

    style User fill:#e1f5ff
    style Frontend fill:#fff4e1
    style Backend fill:#e8f5e9
    style HybridRouter fill:#fff3e0,stroke:#ff9800,stroke-width:3px
    style ReactLoop fill:#f3e5f5
    style FastPath fill:#e3f2fd
    style LangGraph fill:#c8e6c9
    style ChatServices fill:#c8e6c9
    style PostgreSQL fill:#ffebee
    style Qdrant fill:#ffebee
    style UserLayer fill:#e1f5ff
    style AppLayer fill:#fff4e1
    style EngineLayer fill:#ffe0b2
    style ServicesLayer fill:#e8f5e9
    style ModelLayer fill:#f3e5f5
    style DataLayer fill:#ffebee
```

---

### Main Graph and Subgraph Architecture
### 3. Agent Engine (Main Graph)

#### 3.1 Core Flow

```mermaid
graph TB
    subgraph "Main graph MainGraph"
        StartMain[START]
        RetrieveMemory[Memory retrieval]
        MemoryTrigger[Memory trigger]
        InitState[State initialization]
        HybridRouter[Hybrid routing]
        FastChitchat[fast_chitchat]
        FastRAG[fast_rag]
        FastTool[fast_tool]
        ReactReason[react_reason]
        LLMCall[llm_call]
        FinalMain[Final response]
        EndMain[END]
    Start([Start]) --> Retrieve["Retrieve long-term memory"]
    Retrieve --> Trigger["Memory trigger"]
    Trigger --> Init["State initialization"]
    Init --> Route{"Hybrid routing<br/>hybrid_router"}

        StartMain --> RetrieveMemory
        RetrieveMemory --> MemoryTrigger
        MemoryTrigger --> InitState
        InitState --> HybridRouter
    Route -->|Chitchat / simple questions| Fast["⚡ Fast path<br/>(fast_*)"]
    Route -->|Knowledge retrieval| FastRAG["⚡ Fast RAG"]
    Route -->|Tool call| FastTool["⚡ Fast tool"]
    Route -->|Complex task| React["🔄 ReAct loop"]

        HybridRouter --> FastChitchat
        HybridRouter --> FastRAG
        HybridRouter --> FastTool
        HybridRouter --> ReactReason
    React --> Reason["Reasoning"]
    Reason --> Action["Choose action"]
    Action -->|Retrieval needed| RAGNode["RAG retrieval"]
    Action -->|Search| Web["Web search"]
    Action -->|Contacts| Contact["Contacts subgraph"]
    Action -->|Dictionary| Dict["Dictionary subgraph"]
    Action -->|News| News["News subgraph"]
    Action -->|LLM| LLMCall["LLM generation"]

        FastChitchat --> LLMCall
        FastChitchat -.-> ReactReason
        FastRAG --> LLMCall
        FastRAG -.-> ReactReason
        FastTool --> LLMCall
        FastTool -.-> ReactReason
    RAGNode --> Observe["Observe result"]
    Web --> Observe
    Contact --> Observe
    Dict --> Observe
    News --> Observe
    LLMCall --> Observe
    Observe -->|Not done| Reason
    Observe -->|Done| Final["Generate final reply"]

        ReactReason --> RAGRetrieve[RAG retrieval]
        ReactReason --> WebSearchNode[Web search]
        ReactReason --> ContactNode[Contacts subgraph]
        ReactReason --> DictNode[Dictionary subgraph]
        ReactReason --> NewsNode[News subgraph]
        ReactReason --> LLMCall
    Fast --> Final
    FastRAG --> Final
    FastTool --> Final
    Final --> End([End])

        RAGRetrieve --> ReactReason
        WebSearchNode --> ReactReason
        ContactNode --> ReactReason
        DictNode --> ReactReason
        NewsNode --> ReactReason

        LLMCall --> FinalMain
        FinalMain --> EndMain
    end

    subgraph "Contacts subgraph ContactSubgraph"
        StartContact[START]
        IntentContact[parse_intent]
        ListContacts[list_contacts]
        AddContact[add_contact]
        ListEmails[list_emails]
        GenEmail[generate_email_draft]
        HumanReview[human_review]
        SendEmail[send_email]
        SniffContact[sniff_contacts]
        FormatContact[format_result]
        EndContact[END]

        StartContact --> IntentContact
        IntentContact --> ListContacts
        IntentContact --> AddContact
        IntentContact --> ListEmails
        IntentContact --> GenEmail
        IntentContact --> SniffContact
        ListContacts --> FormatContact
        AddContact --> FormatContact
        ListEmails --> FormatContact
        SniffContact --> FormatContact
        GenEmail --> HumanReview
        HumanReview --> SendEmail
        HumanReview --> FormatContact
        SendEmail --> FormatContact
        FormatContact --> EndContact
    end

    subgraph "Dictionary subgraph DictionarySubgraph"
        StartDict[START]
        IntentDict[parse_intent]
        QueryWord[query_word]
        Translate[translate_text]
        ExtractTerms[extract_terms]
        DailyWord[get_daily_word]
        LookupWord[lookup_word_book]
        AddToWord[add_to_word_book]
        FormatDict[format_result]
        EndDict[END]

        StartDict --> IntentDict
        IntentDict --> QueryWord
        IntentDict --> Translate
        IntentDict --> ExtractTerms
        IntentDict --> DailyWord
        IntentDict --> LookupWord
        IntentDict --> AddToWord
        QueryWord --> FormatDict
        Translate --> FormatDict
        ExtractTerms --> FormatDict
        DailyWord --> FormatDict
        LookupWord --> FormatDict
        AddToWord --> FormatDict
        FormatDict --> EndDict
    end

    subgraph "News analysis subgraph NewsSubgraph"
        StartNews[START]
        IntentNews[parse_intent]
        QueryNews[query_news]
        AnalyzeUrl[analyze_url]
        ExtractKeywords[extract_keywords]
        GenReport[generate_report]
        FormatNews[format_result]
        EndNews[END]

        StartNews --> IntentNews
        IntentNews --> QueryNews
        IntentNews --> AnalyzeUrl
        IntentNews --> ExtractKeywords
        IntentNews --> GenReport
        QueryNews --> FormatNews
        AnalyzeUrl --> FormatNews
        ExtractKeywords --> FormatNews
        GenReport --> FormatNews
        FormatNews --> EndNews
    end

    ReactReason -.-> StartContact
    ReactReason -.-> StartDict
    ReactReason -.-> StartNews

    style HybridRouter fill:#fff3e0,stroke:#ff9800,stroke-width:3px
    style ReactReason fill:#e8eaf6
    style Route fill:#fff3e0,stroke:#ff9800,stroke-width:3px
    style React fill:#f3e5f5,stroke:#7e57c2,stroke-width:2px
```

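#### 3.2 Routing Strategy

- **Chitchat / simple Q&A** → `fast_chitchat`: calls the LLM directly to generate a reply
- **Knowledge retrieval** → `fast_rag`: generates a reply after RAG retrieval
- **Tool call** → `fast_tool`: executes an atomic tool operation directly
- **Complex task** → `ReAct loop`: multi-round iteration of reason → act → observe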
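
The routing rules above can be sketched as a small dispatch function. This is only an illustration, not the project's `hybrid_router`: the intent labels (`chitchat`, `knowledge`, `tool`, `complex`) and the fallback to the ReAct loop for unknown intents are assumptions; only the target names and the `use_hybrid_router` switch come from this README.

```python
from typing import Dict

# Hypothetical intent labels mapped to the node names used in this README.
ROUTES: Dict[str, str] = {
    "chitchat": "fast_chitchat",
    "knowledge": "fast_rag",
    "tool": "fast_tool",
    "complex": "react_loop",
}


def route(intent: str, use_hybrid_router: bool = True) -> str:
    """Return the name of the node that should handle a request."""
    if not use_hybrid_router:
        # Pure ReAct mode: every request goes through the loop.
        return "react_loop"
    # Assumption: unrecognized intents fall back to the ReAct loop.
    return ROUTES.get(intent, "react_loop")
```

In a LangGraph graph, a function of this shape would typically be registered as the condition of a conditional edge leaving the router node.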
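
#### 3.3 ReAct Loop in Detail

When a task requires multi-step reasoning, the engine enters the ReAct loop:

1. **Reason**: the LLM analyzes the current state and decides the next action
2. **Act**: execute a tool call (RAG retrieval, web search, subgraph operation, etc.)
3. **Observe**: collect the action's result and update the state
4. If the task is not finished, return to step 1; otherwise exit the loop and generate the final reply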
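
The four steps above can be sketched in plain Python. This is a minimal sketch rather than the actual LangGraph nodes: the `reason` callback stands in for the LLM, `demo_reason` and `demo_tools` are hypothetical stubs, and the `max_steps` guard is an assumption that this README does not mention.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
# A reasoner takes the task and past observations and returns (action, argument).
Reasoner = Callable[[str, List[str]], Tuple[str, str]]


def react_loop(task: str, reason: Reasoner, tools: Dict[str, Tool],
               max_steps: int = 5) -> str:
    """Run reason -> act -> observe until the reasoner returns 'finish'."""
    observations: List[str] = []
    for _ in range(max_steps):
        action, arg = reason(task, observations)      # 1. reason
        if action == "finish":
            return arg                                # 4. done: final reply
        result = tools[action](arg)                   # 2. act
        observations.append(f"{action}: {result}")    # 3. observe
    # Assumed safeguard: stop after max_steps instead of looping forever.
    return "Step limit reached; observations: " + "; ".join(observations)


def demo_reason(task: str, observations: List[str]) -> Tuple[str, str]:
    """Stub reasoner: retrieve once, then answer from the last observation."""
    if not observations:
        return "rag_retrieve", task
    return "finish", f"answer based on {observations[-1]}"


demo_tools: Dict[str, Tool] = {"rag_retrieve": lambda q: f"doc about {q}"}
```

Calling `react_loop("expense policy", demo_reason, demo_tools)` runs one retrieval step and then finishes on the second reasoning step.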

---

### Indexing Workflow (Offline Build)
### 4. Subgraph System

Each subgraph is an independent, modular workflow that the main graph's ReAct loop invokes on demand.

#### 4.1 Contacts Subgraph

Supports contact management, email draft generation, and sending after human review.

```mermaid
graph LR
    Start([START]) --> Intent["parse_intent"]
    Intent --> List["list_contacts"]
    Intent --> Add["add_contact"]
    Intent --> EmailList["list_emails"]
    Intent --> GenEmail["generate_email_draft"]
    Intent --> Sniff["sniff_contacts"]
    GenEmail --> Human["human_review"]
    Human --> Send["send_email"]
    Send --> Format["format_result"]
    List --> Format
    Add --> Format
    EmailList --> Format
    Sniff --> Format
    Format --> End([END])
```

#### 4.2 Dictionary Subgraph

Supports word lookup, translation, term extraction, word of the day, and word book management.

```mermaid
graph LR
    Start([START]) --> Intent["parse_intent"]
    Intent --> Query["query_word"]
    Intent --> Trans["translate_text"]
    Intent --> Terms["extract_terms"]
    Intent --> Daily["get_daily_word"]
    Intent --> Lookup["lookup_word_book"]
    Intent --> AddWord["add_to_word_book"]
    Query --> Format["format_result"]
    Trans --> Format
    Terms --> Format
    Daily --> Format
    Lookup --> Format
    AddWord --> Format
    Format --> End([END])
```

#### 4.3 News Analysis Subgraph

Supports news retrieval, URL analysis, keyword extraction, and report generation.

```mermaid
graph LR
    Start([START]) --> Intent["parse_intent"]
    Intent --> QueryNews["query_news"]
    Intent --> AnalyzeUrl["analyze_url"]
    Intent --> Keywords["extract_keywords"]
    Intent --> Report["generate_report"]
    QueryNews --> Format["format_result"]
    AnalyzeUrl --> Format
    Keywords --> Format
    Report --> Format
    Format --> End([END])
```

---

### 5. RAG System

#### 5.1 Offline Indexing

The full pipeline from document import through splitting and embedding generation to storage in the vector store.

```mermaid
flowchart TB
    subgraph "Document input"
        A1[Document source]
        A2[PDF/DOCX/TXT/Markdown]
    subgraph Input["Document input"]
        DocSource["Document source<br/>PDF/DOCX/TXT/Markdown"]
    end

    subgraph "Document loading"
        B1[rag_indexer/loaders.py]
        B2[UnstructuredLoader]
        B3[PyMuPDFLoader]
        B4[TextLoader]
    subgraph Load["Document loading"]
        Loader["Unstructured / PyMuPDF / TextLoader"]
    end

    subgraph "Text splitting"
        C1[rag_indexer/splitters.py]
        C2[RecursiveCharacterTextSplitter<br/>recursive splitting by separators]
        C3[SemanticChunker<br/>based on semantic similarity]
        C4[ParentChildSplitter<br/>parent-child chunking]
    subgraph Split["Text splitting"]
        Recursive["RecursiveCharacterTextSplitter<br/>recursive splitting by separators"]
        Semantic["SemanticChunker<br/>based on semantic similarity"]
        ParentChild["ParentChildSplitter<br/>parent-child chunking"]
    end

    subgraph "Embedding generation"
        D1[Embedding generation]
        D2[Dense vectors<br/>Qwen3-Embedding-0.6B<br/>llama.cpp server:18001]
        D3[Sparse vectors BM25<br/>FastEmbed]
    subgraph Embed["Embedding generation"]
        Dense["Dense vectors<br/>Qwen3-Embedding-0.6B"]
        Sparse["Sparse vectors BM25<br/>FastEmbed"]
    end

    subgraph "Vector storage"
        E1[Qdrant Vector Store]
        E2[Dense vector index<br/>HNSW algorithm]
        E3[Sparse vector index<br/>BM25]
        E4[rag_core/vector_store.py]
    subgraph Store["Vector storage"]
        QdrantStore["Qdrant<br/>HNSW index + sparse index"]
    end

    A1 --> A2
    A2 --> B1
    B1 --> B2
    B2 --> C1
    C1 --> C2
    C1 --> C3
    C1 --> C4
    C2 --> D1
    C3 --> D1
    C4 --> D1
    D1 --> D2
    D1 --> D3
    D2 --> E1
    D3 --> E1
    E1 --> E2
    E1 --> E3
    DocSource --> Loader
    Loader --> Recursive
    Loader --> Semantic
    Loader --> ParentChild
    Recursive --> Dense
    Semantic --> Dense
    ParentChild --> Dense
    Recursive --> Sparse
    Semantic --> Sparse
    ParentChild --> Sparse
    Dense --> QdrantStore
    Sparse --> QdrantStore

    style A1 fill:#e3f2fd
    style B1 fill:#fff3e0
    style C1 fill:#f3e5f5
    style D1 fill:#e8f5e9
    style E1 fill:#ffebee
    style Input fill:#e3f2fd
    style Load fill:#fff3e0
    style Split fill:#f3e5f5
    style Embed fill:#e8f5e9
    style Store fill:#ffebee
```

**Component details:**

@@ -343,79 +320,110 @@ flowchart TB
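| Component | Technology | Description |
|------|---------|------|
| Document loading | Unstructured / PyMuPDF / TextLoader | Supports multiple document formats |
| Text splitting | RecursiveCharacterTextSplitter | Recursive splitting by separators, 500 characters by default |
| Text splitting | RecursiveCharacterTextSplitter | 500 characters by default, recursive splitting by separators |
| Semantic splitting | SemanticChunker | Automatic splitting based on embedding similarity |
| Parent-child chunk splitting | ParentChildSplitter | Large chunks store context; small chunks are used for retrieval |
| Parent-child splitting | ParentChildSplitter | Large chunks store context; small chunks are used for retrieval |
| Dense embedding | Qwen3-Embedding-0.6B-Q8_0 | llama.cpp server (:18001) |
| Sparse embedding | FastEmbed BM25 | Computed locally, no extra service required |
| Vector storage | Qdrant | HNSW index, high-performance ANN search |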
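
To make the parent-child idea in the table concrete, here is a minimal sketch that cuts text into fixed-size parent chunks and then cuts each parent into smaller child chunks that remember their parent's id. It is not the project's `ParentChildSplitter`: the function name, the fixed character windows, and the `child_size` default are assumptions for illustration (only the 500-character figure appears in this README, as the recursive splitter's default).

```python
from typing import Dict, List, Tuple


def parent_child_split(text: str, parent_size: int = 500,
                       child_size: int = 100
                       ) -> Tuple[Dict[int, str], List[Tuple[int, str]]]:
    """Split text into parents (context) and children (retrieval units)."""
    parents: Dict[int, str] = {}
    children: List[Tuple[int, str]] = []
    for pid, start in enumerate(range(0, len(text), parent_size)):
        parent = text[start:start + parent_size]
        parents[pid] = parent
        # Each child keeps its parent's id so the larger context
        # can be looked up after the child matches a query.
        for offset in range(0, len(parent), child_size):
            children.append((pid, parent[offset:offset + child_size]))
    return parents, children
```

At query time, only the child chunks would be embedded and searched; a matching child's parent id is then used to fetch the larger parent chunk as context for the LLM.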

### Retrieval Workflow (Online Query)
#### 5.2 Online Retrieval

The user query goes through rewriting, hybrid retrieval, fusion, and reranking, and the LLM then generates the answer.

```mermaid
flowchart TB
    subgraph "Query input"
        Q1[User query]
        Q2[Company expense reimbursement process]
    subgraph Input["Query input"]
        Query["User query"]
    end

    subgraph "Query processing"
        R1[Query rewriting]
        R2[Uses chat_services]
    subgraph Processing["Query processing"]
        Rewrite["Query rewriting<br/>(LLM generates multi-perspective queries)"]
    end

    subgraph "Hybrid retrieval"
        S1[Parallel retrieval]
        S2[Dense vector retrieval]
        S3[Sparse BM25 retrieval]
    subgraph Retrieval["Hybrid retrieval"]
        Parallel["Parallel retrieval"]
        DenseRet["Dense vector retrieval"]
        SparseRet["Sparse BM25 retrieval"]
    end

    subgraph "Result fusion"
        F1[RRF fusion]
    subgraph Fusion["Result fusion"]
        RRF["RRF fusion<br/>(Qdrant server-side fusion)"]
    end

    subgraph "Reranking"
        P1[Cross-Encoder reranking]
        P2[Port 18002]
    subgraph Rerank["Reranking"]
        CrossEncoder["Cross-Encoder reranking<br/>bge-reranker-v2-m3"]
    end

    subgraph "LLM generation"
        G1[LLM generates answer]
        G2[chat_services]
    subgraph Generation["Generation"]
        LLMGen["LLM generates answer"]
    end

    Q1 --> Q2
    Q2 --> R1
    R1 --> R2
    R2 --> S1
    S1 --> S2
    S1 --> S3
    S2 --> F1
    S3 --> F1
    F1 --> P1
    P1 --> P2
    P2 --> G1
    G1 --> G2
    Query --> Rewrite --> Parallel
    Parallel --> DenseRet
    Parallel --> SparseRet
    DenseRet --> RRF
    SparseRet --> RRF
    RRF --> CrossEncoder --> LLMGen

    style Q1 fill:#e3f2fd
    style R1 fill:#fff3e0
    style S1 fill:#f3e5f5
    style F1 fill:#e8f5e9
    style P1 fill:#ffebee
    style G1 fill:#fff3e0
    style Input fill:#e3f2fd
    style Processing fill:#fff3e0
    style Retrieval fill:#f3e5f5
    style Fusion fill:#e8f5e9
    style Rerank fill:#ffebee
    style Generation fill:#fff3e0
```

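**Component details:**

| Stage | Technology | Description |
|------|---------|------|
| Query rewriting | MultiQuery | Uses the LLM to generate 3-5 multi-perspective queries |
| Dense retrieval | Qwen3-Embedding | Vector similarity computation, cosine similarity |
| Sparse retrieval | FastEmbed BM25 | Term-frequency statistics (BM25, a TF-IDF-style ranking function) |
| Result fusion | Qdrant Fusion API | Server-side RRF fusion, no data transfer needed |
| Query rewriting | MultiQuery | The LLM generates 3~5 multi-perspective queries |
| Dense retrieval | Qwen3-Embedding | Vector retrieval by cosine similarity |
| Sparse retrieval | FastEmbed BM25 | Term-frequency retrieval (BM25, a TF-IDF-style ranking function) |
| Result fusion | Qdrant Fusion API | Server-side RRF fusion, less data transfer |
| Reranking | bge-reranker-v2-m3 | Cross-Encoder joint encoding, higher precision |
| LLM generation | chat_services | Unified LLM service interface |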
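
For readers unfamiliar with the fusion stage, the sketch below shows Reciprocal Rank Fusion on two ranked lists of document ids: each document scores the sum of `1 / (k + rank)` over the lists it appears in. It illustrates the general technique only. In this system the fusion runs on the Qdrant server, and `rrf_fuse`, the tie-break by id, and the constant `k = 60` (a commonly used value) are assumptions here.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["a", "b", "c"]    # hypothetical dense retrieval result
sparse_hits = ["b", "d", "a"]   # hypothetical BM25 retrieval result
fused = rrf_fuse([dense_hits, sparse_hits])
```

Documents that rank well in both lists (`a` and `b` here) end up ahead of documents that appear in only one list, which is why RRF needs no score normalization between dense and sparse retrievers.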

---

### 6. Model Service Layer

#### 6.1 Multi-Model Fallback Chain

When a call to the preferred model fails, the backup models are tried automatically, one after another, to keep the service available.

```mermaid
graph LR
    Start([API request]) --> Zhipu["Zhipu GLM-4"]
    Zhipu -->|failure| DeepSeek["DeepSeek V3"]
    DeepSeek -->|failure| OpenAI["OpenAI GPT-4o"]
    OpenAI -->|failure| Local["Local Qwen3.5-9B"]
    Local --> Response([Return response])
    style Start fill:#e1f5ff
    style Response fill:#c8e6c9
```

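**Fallback strategy:** cloud models are tried in priority order, with the local model as the last resort, so that no single model provider is a single point of failure.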
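
The fallback behavior can be sketched as a loop over an ordered list of model callables. This is an assumption-laden illustration, not the project's `FallbackServiceChain`: `call_with_fallback`, the stub models, and the model names in `demo_chain` are hypothetical, and a real chain would likely catch narrower exception types and add timeouts.

```python
from typing import Callable, List, Sequence, Tuple

ModelCall = Callable[[str], str]


def call_with_fallback(prompt: str,
                       chain: Sequence[Tuple[str, ModelCall]]
                       ) -> Tuple[str, str]:
    """Try each model in priority order; return (model_name, reply)."""
    errors: List[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose in this sketch
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def failing_cloud_model(prompt: str) -> str:
    """Stub that simulates an unreachable cloud provider."""
    raise ConnectionError("service unavailable")


def local_model(prompt: str) -> str:
    """Stub that stands in for the local model."""
    return f"local: {prompt}"


demo_chain = [
    ("zhipu", failing_cloud_model),
    ("deepseek", failing_cloud_model),
    ("local-qwen", local_model),
]
used_model, reply = call_with_fallback("hi", demo_chain)
```

Returning the model name alongside the reply makes it easy to log which tier actually served a request.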
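
#### 6.2 Unified Service Interfaces

All model calls go through the following three unified interfaces, so upper-layer business logic is unaware of the concrete model:

- **`chat_services`**: chat generation
- **`embedding_services`**: text vectorization
- **`rerank_services`**: reranking of search results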

---

### 7. Data Storage

| Store | Purpose | Access |
|------|------|---------|
| **PostgreSQL** | Conversation history, long-term memory | Remote server, SQLAlchemy ORM |
| **Qdrant** | Document vectors, knowledge base | Remote server, gRPC/HTTP API |
```

### Data Flow Diagram

```
@@ -78,6 +78,10 @@ rag_indexer/
├── index_builder.py      # Main index-building pipeline (custom parent-child chunk implementation)
├── loaders.py            # Document loaders (multi-format support)
├── splitters.py          # Text splitters (recursive / semantic / parent-child)
├── config.py             # Configuration management
├── cli.py                # Command-line interface
├── clear_qdrant.py       # Clear Qdrant collections
├── reset_qdrant.py       # Reset Qdrant collections
└── README.md             # This document
```

@@ -87,14 +91,28 @@ backend/rag_core/
├── vector_store.py       # Qdrant hybrid storage (async)
├── sparse_embedder.py    # BM25 sparse embedding
├── embedders.py          # Embedding model wrappers
├── store.py              # PostgreSQL document store
├── doc_store.py          # PostgreSQL document store
├── client.py             # Qdrant sync/async client factory
└── config.py             # Configuration management
```

```
backend/app/rag/
└── retriever.py          # Hybrid retriever (async)
├── retriever.py          # Hybrid retriever (async)
├── rerank.py             # Remote reranker via llama.cpp
├── query_transform.py    # Multi-query rewrite generator
├── fusion.py             # RRF (Reciprocal Rank Fusion) algorithm
├── pipeline.py           # RAG pipeline orchestration
├── tools.py              # LangChain Tool wrappers
├── evaluate.py           # Evaluation tools
└── README.md             # This document
```

```
backend/app/model_services/
├── embedding_services.py # Embedding service
├── chat_services.py      # LLM service
└── rerank_services.py    # Reranking service
```

## 🎯 Roadmap and Core Algorithms
