556
README.md
@@ -49,21 +49,20 @@
|
|||||||
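- ✅ **Backward compatible**: set `use_hybrid_router=True/False` to switch between hybrid routing and pure ReAct mode
|
- ✅ **Backward compatible**: set `use_hybrid_router=True/False` to switch between hybrid routing and pure ReAct mode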
|
||||||
|
|
||||||
---
|
---
|
||||||
|
OK, below is the complete architecture section, optimized per our discussion. You can paste it directly into the README.
|
||||||
|
|
||||||
|
```markdown
|
||||||
## 🏗️ Technical Architecture
|
## 🏗️ Technical Architecture
|
||||||
|
|
||||||
### Tech Stack Overview
|
### 1. Tech Stack Overview
|
||||||
|
|
||||||
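| Layer | Component | Technology | Description |
|
| Layer | Component | Technology | Description |
|
||||||
|------|------|---------|------|
|
|------|------|---------|------|
|
||||||
| **Agent framework** | Workflow orchestration | LangGraph + LangChain | State-machine-driven agent workflows |
|
| **Agent framework** | Workflow orchestration | LangGraph + LangChain | State-machine-driven agent workflows |
|
||||||
| **Main graph** | Main flow | main_graph/ | Hybrid routing + ReAct loop + tool execution |
|
|
||||||
| **Subgraphs** | Modular subgraphs | subgraphs/ | Contacts, dictionary, news analysis, and other subgraphs |
|
|
||||||
| | Core tools | core/ | Intent understanding, formatted output, human review, web search, charts |
|
|
||||||
| **Vector database** | Vector retrieval | Qdrant | High-performance vector similarity search (remote server) |
|
|
||||||
| **Backend** | API service | FastAPI + Uvicorn | RESTful API + SSE streaming output |
|
| **Backend** | API service | FastAPI + Uvicorn | RESTful API + SSE streaming output |
|
||||||
| **Frontend** | Web UI | Streamlit | Interactive chat interface |
|
| **Frontend** | Web UI | Streamlit | Interactive chat interface |
|
||||||
| **Relational database** | Persistent storage | PostgreSQL | Persistent conversation memory (remote server) |
|
| **Relational database** | Persistent storage | PostgreSQL | Persistent conversation memory (remote server) |
|
||||||
|
| **Vector database** | Vector retrieval | Qdrant | High-performance vector similarity search (remote server) |
|
||||||
| **Containerization** | Service orchestration | Docker + Docker Compose | One-command deployment of all services |
|
| **Containerization** | Service orchestration | Docker + Docker Compose | One-command deployment of all services |
|
||||||
| **CI/CD** | Automated deployment | Gitea Workflows | Build and deploy automatically on push |
|
| **CI/CD** | Automated deployment | Gitea Workflows | Build and deploy automatically on push |
|
||||||
| **LLM service** | Cloud model | Zhipu AI (glm-4-plus) | Fast responses, suited to everyday conversation |
|
| **LLM service** | Cloud model | Zhipu AI (glm-4-plus) | Fast responses, suited to everyday conversation |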
|
||||||
@@ -75,267 +74,245 @@
|
|||||||
| | Rerank service | rerank_services.py | Unified reranking interface |
|
| | Rerank service | rerank_services.py | Unified reranking interface |
|
||||||
| **Embedding** | Vector embedding | llama.cpp server | Local embedding service (:18001) |
|
| **Embedding** | Vector embedding | llama.cpp server | Local embedding service (:18001) |
|
||||||
|
|
||||||
### System Architecture Flowchart
|
---
|
||||||
|
|
||||||
|
### 2. System Overview
|
||||||
|
|
||||||
|
Shows the high-level interactions between system components; execution details are omitted.
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
graph TB
|
graph TB
|
||||||
User[用户浏览器] --> Frontend[Streamlit 前端 :8501]
|
subgraph UserLayer["👤 用户层"]
|
||||||
Frontend --> Backend[FastAPI 后端 :8079]
|
Browser["浏览器"]
|
||||||
|
Streamlit["Streamlit 前端<br/>:8501"]
|
||||||
|
end
|
||||||
|
|
||||||
Backend --> AgentService[AIAgentService]
|
subgraph AppLayer["⚙️ 应用服务层"]
|
||||||
|
FastAPI["FastAPI 后端<br/>:8079"]
|
||||||
|
Agent["AIAgentService<br/>(智能体协调)"]
|
||||||
|
end
|
||||||
|
|
||||||
AgentService --> LangGraph[LangGraph 工作流引擎]
|
subgraph EngineLayer["🧠 智能体引擎"]
|
||||||
|
LangGraph["LangGraph 工作流引擎<br/>路由 / React / 工具调用"]
|
||||||
|
end
|
||||||
|
|
||||||
LangGraph --> RetrieveMemory[记忆检索 retrieve_memory]
|
subgraph ServicesLayer["🧩 领域服务"]
|
||||||
LangGraph --> MemoryTrigger[记忆触发 memory_trigger]
|
RAG["RAG 检索服务"]
|
||||||
LangGraph --> InitState[初始化状态 init_state]
|
Tools["工具集<br/>(搜索、通讯录、词典等)"]
|
||||||
LangGraph --> HybridRouter[混合路由 hybrid_router]
|
end
|
||||||
LangGraph --> ReactLoop[React 循环 react_loop]
|
|
||||||
LangGraph --> FastPath[快速路径 fast_*]
|
|
||||||
LangGraph --> LLMCall[LLM 调用 llm_call]
|
|
||||||
LangGraph --> Summarize[记忆摘要 summarize]
|
|
||||||
LangGraph --> Finalize[最终处理 finalize]
|
|
||||||
|
|
||||||
HybridRouter --> FastChitchat[fast_chitchat]
|
subgraph ModelLayer["🤖 模型层"]
|
||||||
HybridRouter --> FastRAG[fast_rag]
|
LLM["LLM 服务<br/>(多模型降级链)"]
|
||||||
HybridRouter --> FastTool[fast_tool]
|
Embedding["Embedding 服务<br/>:18001"]
|
||||||
HybridRouter --> ReactLoop
|
Rerank["Rerank 服务<br/>:18002"]
|
||||||
|
end
|
||||||
|
|
||||||
ReactLoop --> ReactReason[react_reason 推理节点]
|
subgraph DataLayer["💾 数据层"]
|
||||||
ReactLoop --> RAGRetrieve[rag_retrieve]
|
Qdrant["Qdrant 向量库"]
|
||||||
ReactLoop --> WebSearch[web_search]
|
PostgreSQL["PostgreSQL"]
|
||||||
ReactLoop --> ContactSubgraph[contact_subgraph]
|
end
|
||||||
ReactLoop --> DictionarySubgraph[dictionary_subgraph]
|
|
||||||
ReactLoop --> NewsSubgraph[news_analysis_subgraph]
|
|
||||||
ReactLoop --> HandleError[handle_error]
|
|
||||||
|
|
||||||
RAGRetrieve --> Qdrant[Qdrant向量库]
|
Browser --> Streamlit
|
||||||
RAGRetrieve --> RerankService[Rerank服务]
|
Streamlit --> FastAPI
|
||||||
RAGRetrieve --> EmbeddingService[Embedding服务]
|
FastAPI --> Agent
|
||||||
|
Agent --> LangGraph
|
||||||
|
LangGraph --> RAG
|
||||||
|
LangGraph --> Tools
|
||||||
|
LangGraph --> LLM
|
||||||
|
RAG --> Embedding
|
||||||
|
RAG --> Rerank
|
||||||
|
RAG --> Qdrant
|
||||||
|
Agent --> PostgreSQL
|
||||||
|
LangGraph --> PostgreSQL
|
||||||
|
|
||||||
AgentService --> ChatServices[模型服务层 chat_services]
|
style UserLayer fill:#e1f5ff
|
||||||
ChatServices --> FallbackChain[FallbackServiceChain]
|
style AppLayer fill:#fff4e1
|
||||||
FallbackChain --> Zhipu[智谱 GLM-4]
|
style EngineLayer fill:#ffe0b2
|
||||||
FallbackChain --> DeepSeek[DeepSeek V3]
|
style ServicesLayer fill:#e8f5e9
|
||||||
FallbackChain --> OpenAI[OpenAI GPT-4o]
|
style ModelLayer fill:#f3e5f5
|
||||||
FallbackChain --> LocalQwen[本地 Qwen3.5-9B]
|
style DataLayer fill:#ffebee
|
||||||
|
|
||||||
RetrieveMemory --> PostgreSQL[PostgreSQL]
|
|
||||||
Summarize --> PostgreSQL
|
|
||||||
|
|
||||||
style User fill:#e1f5ff
|
|
||||||
style Frontend fill:#fff4e1
|
|
||||||
style Backend fill:#e8f5e9
|
|
||||||
style HybridRouter fill:#fff3e0,stroke:#ff9800,stroke-width:3px
|
|
||||||
style ReactLoop fill:#f3e5f5
|
|
||||||
style FastPath fill:#e3f2fd
|
|
||||||
style LangGraph fill:#c8e6c9
|
|
||||||
style ChatServices fill:#c8e6c9
|
|
||||||
style PostgreSQL fill:#ffebee
|
|
||||||
style Qdrant fill:#ffebee
|
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
### Main Graph and Subgraph Architecture
|
### 3. Agent Engine (Main Graph)
|
||||||
|
|
||||||
|
#### 3.1 Core Flow
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
graph TB
|
graph TB
|
||||||
subgraph "主图 MainGraph"
|
Start([开始]) --> Retrieve["检索长期记忆"]
|
||||||
StartMain[START]
|
Retrieve --> Trigger["记忆触发"]
|
||||||
RetrieveMemory[记忆检索]
|
Trigger --> Init["初始化状态"]
|
||||||
MemoryTrigger[记忆触发]
|
Init --> Route{"混合路由<br/>hybrid_router"}
|
||||||
InitState[初始化状态]
|
|
||||||
HybridRouter[混合路由]
|
Route -->|闲聊/简单问答| Fast["⚡ 快速路径<br/>(fast_*)"]
|
||||||
FastChitchat[fast_chitchat]
|
Route -->|知识检索| FastRAG["⚡ 快速 RAG"]
|
||||||
FastRAG[fast_rag]
|
Route -->|工具调用| FastTool["⚡ 快速工具"]
|
||||||
FastTool[fast_tool]
|
Route -->|复杂任务| React["🔄 React 循环"]
|
||||||
ReactReason[react_reason]
|
|
||||||
LLMCall[llm_call]
|
React --> Reason["推理"]
|
||||||
FinalMain[最终响应]
|
Reason --> Action["选择行动"]
|
||||||
EndMain[END]
|
Action -->|需要检索| RAGNode["RAG 检索"]
|
||||||
|
Action -->|搜索| Web["联网搜索"]
|
||||||
|
Action -->|通讯录| Contact["通讯录子图"]
|
||||||
|
Action -->|词典| Dict["词典子图"]
|
||||||
|
Action -->|资讯| News["资讯子图"]
|
||||||
|
Action -->|LLM| LLMCall["LLM 生成"]
|
||||||
|
|
||||||
|
RAGNode --> Observe["观察结果"]
|
||||||
|
Web --> Observe
|
||||||
|
Contact --> Observe
|
||||||
|
Dict --> Observe
|
||||||
|
News --> Observe
|
||||||
|
LLMCall --> Observe
|
||||||
|
Observe -->|未完成| Reason
|
||||||
|
Observe -->|完成| Final["生成最终回复"]
|
||||||
|
|
||||||
|
Fast --> Final
|
||||||
|
FastRAG --> Final
|
||||||
|
FastTool --> Final
|
||||||
|
Final --> End([结束])
|
||||||
|
|
||||||
StartMain --> RetrieveMemory
|
style Route fill:#fff3e0,stroke:#ff9800,stroke-width:3px
|
||||||
RetrieveMemory --> MemoryTrigger
|
style React fill:#f3e5f5,stroke:#7e57c2,stroke-width:2px
|
||||||
MemoryTrigger --> InitState
|
|
||||||
InitState --> HybridRouter
|
|
||||||
|
|
||||||
HybridRouter --> FastChitchat
|
|
||||||
HybridRouter --> FastRAG
|
|
||||||
HybridRouter --> FastTool
|
|
||||||
HybridRouter --> ReactReason
|
|
||||||
|
|
||||||
FastChitchat --> LLMCall
|
|
||||||
FastChitchat -.-> ReactReason
|
|
||||||
FastRAG --> LLMCall
|
|
||||||
FastRAG -.-> ReactReason
|
|
||||||
FastTool --> LLMCall
|
|
||||||
FastTool -.-> ReactReason
|
|
||||||
|
|
||||||
ReactReason --> RAGRetrieve[RAG检索]
|
|
||||||
ReactReason --> WebSearchNode[联网搜索]
|
|
||||||
ReactReason --> ContactNode[通讯录子图]
|
|
||||||
ReactReason --> DictNode[词典子图]
|
|
||||||
ReactReason --> NewsNode[资讯子图]
|
|
||||||
ReactReason --> LLMCall
|
|
||||||
|
|
||||||
RAGRetrieve --> ReactReason
|
|
||||||
WebSearchNode --> ReactReason
|
|
||||||
ContactNode --> ReactReason
|
|
||||||
DictNode --> ReactReason
|
|
||||||
NewsNode --> ReactReason
|
|
||||||
|
|
||||||
LLMCall --> FinalMain
|
|
||||||
FinalMain --> EndMain
|
|
||||||
end
|
|
||||||
|
|
||||||
subgraph "通讯录子图 ContactSubgraph"
|
|
||||||
StartContact[START]
|
|
||||||
IntentContact[parse_intent]
|
|
||||||
ListContacts[list_contacts]
|
|
||||||
AddContact[add_contact]
|
|
||||||
ListEmails[list_emails]
|
|
||||||
GenEmail[generate_email_draft]
|
|
||||||
HumanReview[human_review]
|
|
||||||
SendEmail[send_email]
|
|
||||||
SniffContact[sniff_contacts]
|
|
||||||
FormatContact[format_result]
|
|
||||||
EndContact[END]
|
|
||||||
|
|
||||||
StartContact --> IntentContact
|
|
||||||
IntentContact --> ListContacts
|
|
||||||
IntentContact --> AddContact
|
|
||||||
IntentContact --> ListEmails
|
|
||||||
IntentContact --> GenEmail
|
|
||||||
IntentContact --> SniffContact
|
|
||||||
ListContacts --> FormatContact
|
|
||||||
AddContact --> FormatContact
|
|
||||||
ListEmails --> FormatContact
|
|
||||||
SniffContact --> FormatContact
|
|
||||||
GenEmail --> HumanReview
|
|
||||||
HumanReview --> SendEmail
|
|
||||||
HumanReview --> FormatContact
|
|
||||||
SendEmail --> FormatContact
|
|
||||||
FormatContact --> EndContact
|
|
||||||
end
|
|
||||||
|
|
||||||
subgraph "词典子图 DictionarySubgraph"
|
|
||||||
StartDict[START]
|
|
||||||
IntentDict[parse_intent]
|
|
||||||
QueryWord[query_word]
|
|
||||||
Translate[translate_text]
|
|
||||||
ExtractTerms[extract_terms]
|
|
||||||
DailyWord[get_daily_word]
|
|
||||||
LookupWord[lookup_word_book]
|
|
||||||
AddToWord[add_to_word_book]
|
|
||||||
FormatDict[format_result]
|
|
||||||
EndDict[END]
|
|
||||||
|
|
||||||
StartDict --> IntentDict
|
|
||||||
IntentDict --> QueryWord
|
|
||||||
IntentDict --> Translate
|
|
||||||
IntentDict --> ExtractTerms
|
|
||||||
IntentDict --> DailyWord
|
|
||||||
IntentDict --> LookupWord
|
|
||||||
IntentDict --> AddToWord
|
|
||||||
QueryWord --> FormatDict
|
|
||||||
Translate --> FormatDict
|
|
||||||
ExtractTerms --> FormatDict
|
|
||||||
DailyWord --> FormatDict
|
|
||||||
LookupWord --> FormatDict
|
|
||||||
AddToWord --> FormatDict
|
|
||||||
FormatDict --> EndDict
|
|
||||||
end
|
|
||||||
|
|
||||||
subgraph "资讯分析子图 NewsSubgraph"
|
|
||||||
StartNews[START]
|
|
||||||
IntentNews[parse_intent]
|
|
||||||
QueryNews[query_news]
|
|
||||||
AnalyzeUrl[analyze_url]
|
|
||||||
ExtractKeywords[extract_keywords]
|
|
||||||
GenReport[generate_report]
|
|
||||||
FormatNews[format_result]
|
|
||||||
EndNews[END]
|
|
||||||
|
|
||||||
StartNews --> IntentNews
|
|
||||||
IntentNews --> QueryNews
|
|
||||||
IntentNews --> AnalyzeUrl
|
|
||||||
IntentNews --> ExtractKeywords
|
|
||||||
IntentNews --> GenReport
|
|
||||||
QueryNews --> FormatNews
|
|
||||||
AnalyzeUrl --> FormatNews
|
|
||||||
ExtractKeywords --> FormatNews
|
|
||||||
GenReport --> FormatNews
|
|
||||||
FormatNews --> EndNews
|
|
||||||
end
|
|
||||||
|
|
||||||
ReactReason -.-> StartContact
|
|
||||||
ReactReason -.-> StartDict
|
|
||||||
ReactReason -.-> StartNews
|
|
||||||
|
|
||||||
style HybridRouter fill:#fff3e0,stroke:#ff9800,stroke-width:3px
|
|
||||||
style ReactReason fill:#e8eaf6
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
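#### 3.2 Routing Strategy
|
||||||
|
|
||||||
|
- **Chitchat / simple Q&A** → `fast_chitchat`: calls the LLM directly to generate a reply
|
||||||
|
- **Knowledge retrieval** → `fast_rag`: runs RAG retrieval, then generates a reply
|
||||||
|
- **Tool call** → `fast_tool`: executes a single atomic tool operation directly
|
||||||
|
- **Complex task** → ReAct loop: multi-round iteration of reason → act → observe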
|
||||||
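The routing rules above amount to a lookup from a classified intent to a node name. Here is a minimal sketch, assuming intent classification has already happened upstream. The intent labels (`chitchat`, `knowledge`, `tool`) and the `route` function are hypothetical; the node names and the `use_hybrid_router` switch come from this README.

```python
from typing import Dict

# Hypothetical intent labels mapped to the fast-path node names from the README.
FAST_PATHS: Dict[str, str] = {
    "chitchat": "fast_chitchat",  # chitchat / simple Q&A
    "knowledge": "fast_rag",      # knowledge retrieval
    "tool": "fast_tool",          # single atomic tool call
}


def route(intent: str, use_hybrid_router: bool = True) -> str:
    """Return the name of the node that should handle a classified intent."""
    if not use_hybrid_router:
        # Pure ReAct mode: every request goes through the loop.
        return "react_loop"
    # Anything without a fast path is treated as a complex task.
    return FAST_PATHS.get(intent, "react_loop")


print(route("knowledge"))                          # fast_rag
print(route("plan a trip"))                        # react_loop
print(route("chitchat", use_hybrid_router=False))  # react_loop
```

Sending unknown intents to the ReAct loop by default keeps the fast paths an optimization rather than a correctness requirement.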
|
|
||||||
|
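#### 3.3 The ReAct Loop in Detail
|
||||||
|
|
||||||
|
When a task needs multi-step reasoning, the engine enters the ReAct loop:
|
||||||
|
|
||||||
|
1. **Reason**: the LLM analyzes the current state and decides the next action
|
||||||
|
2. **Act**: executes a tool call (RAG retrieval, web search, a subgraph operation, and so on)
|
||||||
|
3. **Observe**: collects the action's result and updates the state
|
||||||
|
4. If the task is not finished, return to step 1; otherwise exit the loop and generate the final reply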
|
||||||
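The four steps above can be sketched as a bounded loop. This is an illustration, not the project's LangGraph implementation: `AgentState`, `react_loop`, the scripted policy, and the `max_steps` cap are all assumptions made for the example, and the policy stands in for the LLM's reasoning step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class AgentState:
    question: str
    observations: List[str] = field(default_factory=list)
    answer: Optional[str] = None


# A policy returns (tool_name, tool_input) or ("finish", final_answer).
Policy = Callable[[AgentState], Tuple[str, str]]
Tool = Callable[[str], str]


def react_loop(state: AgentState, policy: Policy, tools: Dict[str, Tool],
               max_steps: int = 5) -> AgentState:
    """Run reason -> act -> observe until the policy finishes or steps run out."""
    for _ in range(max_steps):
        action, argument = policy(state)          # 1. reason
        if action == "finish":                    # 4. exit with a final reply
            state.answer = argument
            return state
        tool = tools.get(action)
        result = tool(argument) if tool else f"unknown tool: {action}"  # 2. act
        state.observations.append(result)         # 3. observe
    state.answer = "step limit reached"           # assumed safeguard
    return state


def scripted_policy(state: AgentState) -> Tuple[str, str]:
    """Stand-in for the LLM: retrieve once, then answer from the result."""
    if not state.observations:
        return "rag_retrieve", state.question
    return "finish", f"Answer based on: {state.observations[-1]}"


tools = {"rag_retrieve": lambda query: f"document about {query}"}
final = react_loop(AgentState("expense policy"), scripted_policy, tools)
print(final.answer)
```

The step cap is the important design choice here: without it, a policy that never emits `finish` would loop forever.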
|
|
||||||
---
|
---
|
||||||
|
|
||||||
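### Indexing Workflow (Offline Build)
|
### 4. Subgraphs
|
||||||
|
|
||||||
|
Each subgraph is an independent, modular workflow that the main graph's ReAct loop invokes on demand.
|
||||||
|
|
||||||
|
#### 4.1 Contacts Subgraph
|
||||||
|
|
||||||
|
Supports contact management, email draft generation, and sending after human review.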
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
Start([START]) --> Intent["parse_intent"]
|
||||||
|
Intent --> List["list_contacts"]
|
||||||
|
Intent --> Add["add_contact"]
|
||||||
|
Intent --> EmailList["list_emails"]
|
||||||
|
Intent --> GenEmail["generate_email_draft"]
|
||||||
|
Intent --> Sniff["sniff_contacts"]
|
||||||
|
GenEmail --> Human["human_review"]
|
||||||
|
Human --> Send["send_email"]
|
||||||
|
Send --> Format["format_result"]
|
||||||
|
List --> Format
|
||||||
|
Add --> Format
|
||||||
|
EmailList --> Format
|
||||||
|
Sniff --> Format
|
||||||
|
Format --> End([END])
|
||||||
|
```
|
||||||
|
|
||||||
|
#### 4.2 Dictionary Subgraph
|
||||||
|
|
||||||
|
Supports word lookup, translation, term extraction, word of the day, and word book management.
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
Start([START]) --> Intent["parse_intent"]
|
||||||
|
Intent --> Query["query_word"]
|
||||||
|
Intent --> Trans["translate_text"]
|
||||||
|
Intent --> Terms["extract_terms"]
|
||||||
|
Intent --> Daily["get_daily_word"]
|
||||||
|
Intent --> Lookup["lookup_word_book"]
|
||||||
|
Intent --> AddWord["add_to_word_book"]
|
||||||
|
Query --> Format["format_result"]
|
||||||
|
Trans --> Format
|
||||||
|
Terms --> Format
|
||||||
|
Daily --> Format
|
||||||
|
Lookup --> Format
|
||||||
|
AddWord --> Format
|
||||||
|
Format --> End([END])
|
||||||
|
```
|
||||||
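The dictionary subgraph above has a parse, dispatch, format shape, which the following sketch mirrors with plain functions. The node names come from the diagram; the keyword rule in `parse_intent` and the handler bodies are placeholders invented for illustration, and only two of the six branches are shown.

```python
from typing import Callable, Dict


def parse_intent(text: str) -> str:
    """Pick a branch; a keyword rule stands in for real intent parsing."""
    return "translate_text" if text.lower().startswith("translate ") else "query_word"


def query_word(text: str) -> str:
    return f"definition of {text!r}"  # placeholder result


def translate_text(text: str) -> str:
    return f"translation of {text[len('translate '):]!r}"  # placeholder result


HANDLERS: Dict[str, Callable[[str], str]] = {
    "query_word": query_word,
    "translate_text": translate_text,
}


def format_result(intent: str, payload: str) -> str:
    return f"[{intent}] {payload}"


def run_dictionary_subgraph(text: str) -> str:
    intent = parse_intent(text)            # START -> parse_intent
    payload = HANDLERS[intent](text)       # parse_intent -> chosen branch
    return format_result(intent, payload)  # branch -> format_result -> END


print(run_dictionary_subgraph("ephemeral"))
print(run_dictionary_subgraph("translate good morning"))
```

Because every branch converges on `format_result`, callers get one output shape no matter which operation ran.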
|
|
||||||
|
#### 4.3 News Analysis Subgraph
|
||||||
|
|
||||||
|
Supports news search, URL analysis, keyword extraction, and report generation.
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
Start([START]) --> Intent["parse_intent"]
|
||||||
|
Intent --> QueryNews["query_news"]
|
||||||
|
Intent --> AnalyzeUrl["analyze_url"]
|
||||||
|
Intent --> Keywords["extract_keywords"]
|
||||||
|
Intent --> Report["generate_report"]
|
||||||
|
QueryNews --> Format["format_result"]
|
||||||
|
AnalyzeUrl --> Format
|
||||||
|
Keywords --> Format
|
||||||
|
Report --> Format
|
||||||
|
Format --> End([END])
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
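### 5. RAG System
|
||||||
|
|
||||||
|
#### 5.1 Offline Indexing
|
||||||
|
|
||||||
|
The full pipeline from document ingestion through splitting and embedding generation to storage in the vector database.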
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
flowchart TB
|
flowchart TB
|
||||||
subgraph 文档输入
|
subgraph Input["文档输入"]
|
||||||
A1[文档源]
|
DocSource["文档源<br/>PDF/DOCX/TXT/Markdown"]
|
||||||
A2[PDF/DOCX/TXT/Markdown]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph 文档加载
|
subgraph Load["文档加载"]
|
||||||
B1[rag_indexer/loaders.py]
|
Loader["Unstructured / PyMuPDF / TextLoader"]
|
||||||
B2[UnstructuredLoader]
|
|
||||||
B3[PyMuPDFLoader]
|
|
||||||
B4[TextLoader]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph 文本切分
|
subgraph Split["文本切分"]
|
||||||
C1[rag_indexer/splitters.py]
|
Recursive["RecursiveCharacterTextSplitter<br/>按分隔符递归切分"]
|
||||||
C2[RecursiveCharacterTextSplitter<br/>按分隔符递归切分]
|
Semantic["SemanticChunker<br/>基于语义相似度"]
|
||||||
C3[SemanticChunker<br/>基于语义相似度]
|
ParentChild["ParentChildSplitter<br/>父子块切分"]
|
||||||
C4[ParentChildSplitter<br/>父子块切分]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph 嵌入生成
|
subgraph Embed["嵌入生成"]
|
||||||
D1[Embedding 生成]
|
Dense["稠密向量<br/>Qwen3-Embedding-0.6B"]
|
||||||
D2[稠密向量<br/>Qwen3-Embedding-0.6B<br/>llama.cpp server:18001]
|
Sparse["稀疏向量 BM25<br/>FastEmbed"]
|
||||||
D3[稀疏向量 BM25<br/>FastEmbed]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph 向量存储
|
subgraph Store["向量存储"]
|
||||||
E1[Qdrant Vector Store]
|
QdrantStore["Qdrant<br/>HNSW 索引 + 稀疏索引"]
|
||||||
E2[稠密向量索引<br/>HNSW 算法]
|
|
||||||
E3[稀疏向量索引<br/>BM25]
|
|
||||||
E4[rag_core/vector_store.py]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
A1 --> A2
|
DocSource --> Loader
|
||||||
A2 --> B1
|
Loader --> Recursive
|
||||||
B1 --> B2
|
Loader --> Semantic
|
||||||
B2 --> C1
|
Loader --> ParentChild
|
||||||
C1 --> C2
|
Recursive --> Dense
|
||||||
C1 --> C3
|
Semantic --> Dense
|
||||||
C1 --> C4
|
ParentChild --> Dense
|
||||||
C2 --> D1
|
Recursive --> Sparse
|
||||||
C3 --> D1
|
Semantic --> Sparse
|
||||||
C4 --> D1
|
ParentChild --> Sparse
|
||||||
D1 --> D2
|
Dense --> QdrantStore
|
||||||
D1 --> D3
|
Sparse --> QdrantStore
|
||||||
D2 --> E1
|
|
||||||
D3 --> E1
|
|
||||||
E1 --> E2
|
|
||||||
E1 --> E3
|
|
||||||
|
|
||||||
style A1 fill:#e3f2fd
|
style Input fill:#e3f2fd
|
||||||
style B1 fill:#fff3e0
|
style Load fill:#fff3e0
|
||||||
style C1 fill:#f3e5f5
|
style Split fill:#f3e5f5
|
||||||
style D1 fill:#e8f5e9
|
style Embed fill:#e8f5e9
|
||||||
style E1 fill:#ffebee
|
style Store fill:#ffebee
|
||||||
```
|
```
|
||||||
|
|
||||||
**Technical components:**
|
**Technical components:**
|
||||||
@@ -343,79 +320,110 @@ flowchart TB
|
|||||||
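| Component | Technology | Description |
|
| Component | Technology | Description |
|
||||||
|------|---------|------|
|
|------|---------|------|
|
||||||
| Document loading | Unstructured / PyMuPDF / TextLoader | Supports multiple document formats |
|
| Document loading | Unstructured / PyMuPDF / TextLoader | Supports multiple document formats |
|
||||||
| Text splitting | RecursiveCharacterTextSplitter | Recursive splitting on separators, 500 characters by default |
|
| Text splitting | RecursiveCharacterTextSplitter | 500 characters by default, recursive splitting on separators |
|
||||||
| Semantic splitting | SemanticChunker | Automatic splitting based on embedding similarity |
|
| Semantic splitting | SemanticChunker | Automatic splitting based on embedding similarity |
|
||||||
| Parent-child chunk splitting | ParentChildSplitter | Large chunks store context, small chunks are used for retrieval |
|
| Parent-child splitting | ParentChildSplitter | Large chunks store context, small chunks are used for retrieval |
|
||||||
| Dense embedding | Qwen3-Embedding-0.6B-Q8_0 | llama.cpp server (:18001) |
|
| Dense embedding | Qwen3-Embedding-0.6B-Q8_0 | llama.cpp server (:18001) |
|
||||||
| Sparse embedding | FastEmbed BM25 | Computed locally, no extra service required |
|
| Sparse embedding | FastEmbed BM25 | Computed locally, no extra service required |
|
||||||
| Vector storage | Qdrant | HNSW index, high-performance ANN retrieval |
|
| Vector storage | Qdrant | HNSW index, high-performance ANN retrieval |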
|
||||||
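The parent-child row is the least self-explanatory entry in the table. The idea is that small child chunks are embedded and searched, and each one carries the id of the larger parent chunk that is returned as context. In this minimal sketch, fixed-width slicing stands in for the real splitter, and `parent_child_split`, `parent_size`, and the id format are assumptions. Setting `child_size` to 500 mirrors the default chunk size in the table, which is also an assumption.

```python
from typing import Dict, List, Tuple


def split_fixed(text: str, size: int) -> List[str]:
    """Slice text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def parent_child_split(
    text: str, parent_size: int = 2000, child_size: int = 500
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Return (parents keyed by id, list of (parent_id, child_text))."""
    parents: Dict[str, str] = {}
    children: List[Tuple[str, str]] = []
    for index, parent in enumerate(split_fixed(text, parent_size)):
        parent_id = f"parent-{index}"  # hypothetical id scheme
        parents[parent_id] = parent
        for child in split_fixed(parent, child_size):
            # Children are what would be embedded; the id links back to context.
            children.append((parent_id, child))
    return parents, children


parents, children = parent_child_split("x" * 4500)
print(len(parents), len(children))  # 3 9
```

At query time a hit on a child chunk is resolved through its `parent_id`, so the LLM sees the surrounding context rather than the small fragment alone.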
|
|
||||||
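### Retrieval Workflow (Online Query)
|
#### 5.2 Online Retrieval
|
||||||
|
|
||||||
|
A user query goes through rewriting, hybrid retrieval, fusion, and reranking before the LLM generates the answer.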
|
||||||
|
|
||||||
```mermaid
|
```mermaid
|
||||||
flowchart TB
|
flowchart TB
|
||||||
subgraph "查询输入"
|
subgraph Input["查询输入"]
|
||||||
Q1[用户查询]
|
Query["用户查询"]
|
||||||
Q2[公司报销流程]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph "查询处理"
|
subgraph Processing["查询处理"]
|
||||||
R1[查询改写]
|
Rewrite["查询改写<br/>(LLM 生成多角度查询)"]
|
||||||
R2[使用 chat_services]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph "混合检索"
|
subgraph Retrieval["混合检索"]
|
||||||
S1[并行检索]
|
Parallel["并行检索"]
|
||||||
S2[稠密向量检索]
|
DenseRet["稠密向量检索"]
|
||||||
S3[稀疏BM25检索]
|
SparseRet["稀疏 BM25 检索"]
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph "结果融合"
|
subgraph Fusion["结果融合"]
|
||||||
F1[RRF融合]
|
RRF["RRF 融合<br/>(Qdrant 服务端融合)"]
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph "重排序"
|
subgraph Rerank["重排序"]
|
||||||
P1[Cross-Encoder重排]
|
CrossEncoder["Cross-Encoder 重排<br/>bge-reranker-v2-m3"]
|
||||||
P2[18002端口]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
subgraph "LLM生成"
|
subgraph Generation["生成"]
|
||||||
G1[LLM生成回答]
|
LLMGen["LLM 生成回答"]
|
||||||
G2[chat_services]
|
|
||||||
end
|
end
|
||||||
|
|
||||||
Q1 --> Q2
|
Query --> Rewrite --> Parallel
|
||||||
Q2 --> R1
|
Parallel --> DenseRet
|
||||||
R1 --> R2
|
Parallel --> SparseRet
|
||||||
R2 --> S1
|
DenseRet --> RRF
|
||||||
S1 --> S2
|
SparseRet --> RRF
|
||||||
S1 --> S3
|
RRF --> CrossEncoder --> LLMGen
|
||||||
S2 --> F1
|
|
||||||
S3 --> F1
|
|
||||||
F1 --> P1
|
|
||||||
P1 --> P2
|
|
||||||
P2 --> G1
|
|
||||||
G1 --> G2
|
|
||||||
|
|
||||||
style Q1 fill:#e3f2fd
|
style Input fill:#e3f2fd
|
||||||
style R1 fill:#fff3e0
|
style Processing fill:#fff3e0
|
||||||
style S1 fill:#f3e5f5
|
style Retrieval fill:#f3e5f5
|
||||||
style F1 fill:#e8f5e9
|
style Fusion fill:#e8f5e9
|
||||||
style P1 fill:#ffebee
|
style Rerank fill:#ffebee
|
||||||
style G1 fill:#fff3e0
|
style Generation fill:#fff3e0
|
||||||
```
|
```
|
||||||
|
|
||||||
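**Technical components:**
|
**Technical components:**
|
||||||
|
|
||||||
| Stage | Technology | Description |
|
| Stage | Technology | Description |
|
||||||
|------|---------|------|
|
|------|---------|------|
|
||||||
| Query rewriting | MultiQuery | Uses the LLM to generate 3-5 queries from different angles |
|
| Query rewriting | MultiQuery | The LLM generates 3~5 queries from different angles |
|
||||||
| Dense retrieval | Qwen3-Embedding | Vector similarity computation, cosine similarity |
|
| Dense retrieval | Qwen3-Embedding | Vector retrieval by cosine similarity |
|
||||||
| Sparse retrieval | FastEmbed BM25 | Term-frequency (TF-IDF) statistics |
|
| Sparse retrieval | FastEmbed BM25 | Term-frequency-based retrieval (BM25, a TF-IDF-family ranking function) |
|
||||||
| Result fusion | Qdrant Fusion API | Server-side RRF fusion, no data transfer needed |
|
| Result fusion | Qdrant Fusion API | Server-side RRF fusion, which reduces data transfer |
|
||||||
| Reranking | bge-reranker-v2-m3 | Cross-encoder that encodes query and document jointly, for higher accuracy |
|
| Reranking | bge-reranker-v2-m3 | Cross-encoder that encodes query and document jointly, for higher accuracy |
|
||||||
| LLM generation | chat_services | Unified LLM service interface |
|
| LLM generation | chat_services | Unified LLM service interface |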
|
||||||
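Fusion runs on the Qdrant server in this design, but the scoring rule is simple enough to show directly. The sketch below implements reciprocal rank fusion, where each document scores the sum of 1 / (k + rank) over the rankings it appears in. `rrf_fuse` is a hypothetical helper, `k = 60` is a commonly used constant rather than a value taken from this project, and the document ids are made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of document ids with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]   # made-up result of dense retrieval
sparse_hits = ["b", "d", "a"]  # made-up result of BM25 retrieval
print(rrf_fuse([dense_hits, sparse_hits]))  # ['b', 'a', 'd', 'c']
```

Because RRF uses only ranks, the cosine scores and BM25 scores never need to be put on a common scale.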
|
|
||||||
|
---
|
||||||
|
|
||||||
|
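### 6. Model Service Layer
|
||||||
|
|
||||||
|
#### 6.1 Multi-Model Fallback Chain
|
||||||
|
|
||||||
|
When a call to the preferred model fails, the backup models are tried automatically in order, keeping the service available.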
|
||||||
|
|
||||||
|
```mermaid
|
||||||
|
graph LR
|
||||||
|
Start([API request]) --> Zhipu["Zhipu GLM-4"]
|
||||||
|
Zhipu -->|failure| DeepSeek["DeepSeek V3"]
|
||||||
|
DeepSeek -->|failure| OpenAI["OpenAI GPT-4o"]
|
||||||
|
OpenAI -->|failure| Local["Local Qwen3.5-9B"]
|
||||||
|
Local --> Response([Return response])
|
||||||
|
style Start fill:#e1f5ff
|
||||||
|
style Response fill:#c8e6c9
|
||||||
|
```
|
||||||
|
|
||||||
|
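**Fallback strategy:** cloud models are tried in priority order, with the local model as the last resort, so that no single provider becomes a single point of failure.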
|
||||||
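The fallback behaviour described above can be sketched in a few lines. This is not the project's `FallbackServiceChain`. It is a minimal illustration in which `chat_with_fallback`, `AllProvidersFailed`, and the stub providers are hypothetical, and any exception from a provider counts as a failure.

```python
from typing import Callable, List, Sequence, Tuple

ChatFn = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def chat_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, ChatFn]]
) -> Tuple[str, str]:
    """Try providers in priority order; return (provider_name, reply)."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def unavailable(prompt: str) -> str:
    """Stub provider that always fails."""
    raise TimeoutError("simulated timeout")


def echo(prompt: str) -> str:
    """Stub provider that always succeeds."""
    return f"echo: {prompt}"


chain = [("zhipu", unavailable), ("deepseek", echo), ("local", echo)]
print(chat_with_fallback("hello", chain))  # ('deepseek', 'echo: hello')
```

Returning the provider name alongside the reply makes it easy to log how often the chain actually falls back.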
|
|
||||||
|
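#### 6.2 Unified Service Interfaces
|
||||||
|
|
||||||
|
All model calls go through the following three unified interfaces, so upper-layer business logic is unaware of the concrete model:
|
||||||
|
|
||||||
|
- **`chat_services`**: chat generation
|
||||||
|
- **`embedding_services`**: text embedding
|
||||||
|
- **`rerank_services`**: reranking of search results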
|
||||||
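One way to keep upper layers unaware of the concrete model is to code against structural interfaces. The sketch below uses `typing.Protocol`. The method names and signatures (`chat`, `embed`, `rerank`) are assumptions for illustration, since this README names the three service modules but not their APIs.

```python
from typing import List, Protocol, Sequence


class ChatService(Protocol):
    def chat(self, prompt: str) -> str: ...


class EmbeddingService(Protocol):
    def embed(self, texts: Sequence[str]) -> List[List[float]]: ...


class RerankService(Protocol):
    def rerank(self, query: str, documents: Sequence[str]) -> List[float]: ...


def answer(query: str, documents: Sequence[str],
           chat: ChatService, reranker: RerankService) -> str:
    """Pick the highest-scoring document and ask the chat model to use it."""
    scores = reranker.rerank(query, documents)
    best = max(zip(scores, documents))[1]
    return chat.chat(f"Context: {best}\nQuestion: {query}")


class OverlapReranker:
    """Stub: score each document by the number of words shared with the query."""

    def rerank(self, query: str, documents: Sequence[str]) -> List[float]:
        words = set(query.lower().split())
        return [float(len(words & set(d.lower().split()))) for d in documents]


class EchoChat:
    """Stub: return the prompt unchanged."""

    def chat(self, prompt: str) -> str:
        return prompt


docs = ["travel expense policy", "office seating chart"]
print(answer("expense policy", docs, EchoChat(), OverlapReranker()))
```

The stubs satisfy the protocols without inheriting from them, which is what lets a cloud model and a local model be swapped without touching callers.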
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### 7. Data Storage
|
||||||
|
|
||||||
|
| Store | Purpose | Access |
|
||||||
|
|------|------|---------|
|
||||||
|
| **PostgreSQL** | Conversation history, long-term memory | Remote server, SQLAlchemy ORM |
|
||||||
|
| **Qdrant** | Document vectors, knowledge base | Remote server, gRPC/HTTP API |
|
||||||
|
```
|
||||||
|
|
||||||
### Data Flow Diagram
|
### Data Flow Diagram
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|||||||
@@ -78,6 +78,10 @@ rag_indexer/
|
|||||||
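├── index_builder.py # Main index-building pipeline (custom parent-child chunk implementation)
|
├── index_builder.py # Main index-building pipeline (custom parent-child chunk implementation)
|
||||||
├── loaders.py # Document loaders (multi-format support)
|
├── loaders.py # Document loaders (multi-format support)
|
||||||
├── splitters.py # Text splitters (recursive / semantic / parent-child)
|
├── splitters.py # Text splitters (recursive / semantic / parent-child)
|
||||||
|
├── config.py # Configuration management
|
||||||
|
├── cli.py # Command-line interface
|
||||||
|
├── clear_qdrant.py # Clears the Qdrant collection
|
||||||
|
├── reset_qdrant.py # Resets the Qdrant collection
|
||||||
└── README.md # This document
|
└── README.md # This document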
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -87,14 +91,28 @@ backend/rag_core/
|
|||||||
├── vector_store.py # Qdrant hybrid storage (async)
|
├── vector_store.py # Qdrant hybrid storage (async)
|
||||||
├── sparse_embedder.py # BM25 sparse embedding
|
├── sparse_embedder.py # BM25 sparse embedding
|
||||||
├── embedders.py # Embedding model wrappers
|
├── embedders.py # Embedding model wrappers
|
||||||
├── store.py # PostgreSQL document store
|
├── doc_store.py # PostgreSQL document store
|
||||||
├── client.py # Factory for sync/async Qdrant clients
|
├── client.py # Factory for sync/async Qdrant clients
|
||||||
└── config.py # Configuration management
|
└── config.py # Configuration management
|
||||||
```
|
```
|
||||||
|
|
||||||
```
|
```
|
||||||
backend/app/rag/
|
backend/app/rag/
|
||||||
└── retriever.py # Hybrid retriever (async)
|
├── retriever.py # Hybrid retriever (async)
|
||||||
|
├── rerank.py # Remote reranker backed by llama.cpp
|
||||||
|
├── query_transform.py # Multi-query rewrite generator
|
||||||
|
├── fusion.py # RRF (reciprocal rank fusion) algorithm
|
||||||
|
├── pipeline.py # RAG pipeline orchestration
|
||||||
|
├── tools.py # LangChain Tool wrappers
|
||||||
|
├── evaluate.py # Evaluation tools
|
||||||
|
└── README.md # This document
|
||||||
|
```
|
||||||
|
|
||||||
|
```
|
||||||
|
backend/app/model_services/
|
||||||
|
├── embedding_services.py # Embedding service
|
||||||
|
├── chat_services.py # LLM service
|
||||||
|
└── rerank_services.py # Reranking service
|
||||||
```
|
```
|
||||||
|
|
||||||
## 🎯 Roadmap and Core Algorithms
|
## 🎯 Roadmap and Core Algorithms
|
||||||
|
|||||||