📚 Tidy up and update documentation

- Remove outdated docs: REACT_PLAN.md, backend/docs/HYBRID_ROUTER.md
- Update REACT_MODE_SUMMARY.md: add the new hybrid routing architecture
- Update README.md: add new features such as hybrid routing and dual-model services
- Update backend/app/README.md: add hybrid_router.py
- Update backend/app/model_services/README.md: add get_chat_service/get_small_llm_service
- Update .gitignore: allow REACT_MODE_SUMMARY.md to be committed
- Add backend/test/test_hybrid_router.py: test script
2026-05-03 16:53:34 +08:00
parent a5fc9cd5d8
commit 53fbfb4741
6 changed files with 218 additions and 278 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -26,6 +26,7 @@
 !.gitignore
 !README.md
 !QUICKSTART.md
 !REACT_MODE_SUMMARY.md
 !LICENSE
 !requirement.txt
 !.env.docker
--- a/REACT_MODE_SUMMARY.md
+++ b/REACT_MODE_SUMMARY.md
@@ -0,0 +1,149 @@
 # React Mode Architecture Summary
 ---
 ## ✅ Current Architecture: Hybrid Routing + React Loop
 This project uses a **two-layer hybrid architecture**:
 ```
 ┌─────────────────────────────────────────────────────────────┐
 │ Layer 1: front hybrid router (low latency)                  │
 │   ├─ Rule-based fast triage (no LLM)                        │
 │   ├─ Lightweight intent classification (smallLLM)           │
 │   └─ Fast paths (fast_chitchat, fast_rag, fast_tool)        │
 └───────────────────────┬─────────────────────────────────────┘
                         ↓ (automatic escalation on failure)
 ┌─────────────────────────────────────────────────────────────┐
 │ Layer 2: full React loop (fallback for complex tasks)       │
 │   └─ Reason → Act → Observe (up to 40 steps)                │
 └─────────────────────────────────────────────────────────────┘
 ```
 ---
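 ## 🎯 Layer 1: Front Hybrid Router (New)
 ### Core Features
 | Feature | Description |
 |------|------|
 | Rule-based fast triage | No LLM, millisecond-level response; handles greetings, thanks, subgraph keywords, etc. |
 | Lightweight intent classification | Uses smallLLM; collapsed into 4 classes: chitchat, knowledge, tool, complex |
 | Fast paths | Three fast-path nodes: fast_chitchat, fast_rag, fast_tool |
 | Automatic escalation | When a fast path fails, control falls back to the full React loop |
 | SSE event enhancements | intent_classified, path_decision, fast_path_*, escalation |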
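The rule-based triage row in the table above can be illustrated with a minimal sketch. It assumes a small table of regular-expression rules; the patterns, the `rule_route` name, and the mapping of subgraph keywords to `fast_tool` are hypothetical and are not taken from `hybrid_router.py`.

```python
import re
from typing import Optional

# Hypothetical rules for illustration only; the real patterns are not shown in this commit.
_RULES = [
    # Greetings and thanks go straight to the chitchat fast path.
    (re.compile(r"^(你好|您好|hi|hello|谢谢|thanks)[!！。.\s]*$", re.IGNORECASE), "fast_chitchat"),
    # Assumed subgraph keywords (contact, dictionary) go to the tool fast path.
    (re.compile(r"(联系人|contact|词典|dictionary)", re.IGNORECASE), "fast_tool"),
]


def rule_route(user_query: str) -> Optional[str]:
    """Return a fast-path name if a rule matches, else None (defer to smallLLM)."""
    text = user_query.strip()
    for pattern, path in _RULES:
        if pattern.search(text):
            return path
    return None
```

Returning `None` rather than a default path keeps the rules cheap and conservative: anything the rules do not recognize is left to the lightweight classifier.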
 ### Quick Flow Diagram
 ```
 START
  ↓
 init_state
  ↓
 hybrid_router (front router) ←──────────────────┐
  ↓                                              │
  ├─ rule triage      → fast_chitchat →──────────┤
  │                         ↓                    │
  ├─ model classifier → fast_rag →───────────────┤
  │                         ↓                    │
  ├─                    fast_tool →──────────────┤
  │                         ↓                    │
  └─                    react_loop →─────────────┤
                            ↓                    │
                  success or escalate? ──────────┘
                     ↓            ↓
                  finalize   react_reason
 ```
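One way to read the escalation branch in the diagram above, as a minimal sketch: try the fast path first and hand the query to the full React loop when the fast path raises or returns nothing. The function name `run_with_escalation` and the `escalated` / `escalation_reason` keys are illustrative assumptions, not the project's actual state fields.

```python
from typing import Callable, Optional


def run_with_escalation(
    query: str,
    fast_path: Callable[[str], Optional[str]],
    react_loop: Callable[[str], str],
) -> dict:
    """Try the fast path; escalate to the full loop if it raises or returns nothing."""
    try:
        answer = fast_path(query)
    except Exception as exc:  # any fast-path failure triggers escalation
        answer, reason = None, f"fast path error: {exc}"
    else:
        reason = "fast path returned no answer"
    if answer:
        return {"final_result": answer, "escalated": False}
    # Fall back to the slower but more capable loop.
    return {
        "final_result": react_loop(query),
        "escalated": True,
        "escalation_reason": reason,
    }
```

Recording the reason alongside the result makes it straightforward to emit an `escalation` event for the front end.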
 ### Key Files
 | File | Description |
 |------|------|
 | `backend/app/main_graph/nodes/hybrid_router.py` | Full hybrid router implementation |
 | `backend/app/model_services/chat_services.py` | get_chat_service() + get_small_llm_service() |
 | `backend/app/main_graph/utils/main_graph_builder.py` | Integrates the hybrid router into the main graph |
 ### Configuration
 ```python
 # Choose the mode when building the graph
 graph = build_react_main_graph(use_hybrid_router=True)  # enable hybrid routing (default)
 graph = build_react_main_graph(use_hybrid_router=False) # disable it; pure React loop
 ```
 ---
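 ## 🎯 Layer 2: Full React Loop (Retained)
 ### Core Characteristics
 | Feature | Description |
 |------|------|
 | Iterative reasoning | Each reasoning round decides the next step, up to 40 steps |
 | Structured errors | ErrorRecord + ErrorSeverity |
 | Timeout retry | RAG: up to 2 times; subgraphs: up to 1 time |
 | Subgraph integration | contact, dictionary, news_analysis |
 | RAG retrieval | Supports re-retrieval (re_retrieve) |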
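The timeout-retry row above could be implemented along the following lines. This is a sketch under stated assumptions: `call_with_retry` and its parameters are hypothetical (the actual `retry_utils.py` API is not shown in this commit), and the limits in the table are read here as retry counts.

```python
import concurrent.futures
import time


def call_with_retry(func, *args, max_retries=2, timeout_s=5.0, backoff_s=0.0):
    """Call func(*args); on timeout or exception, retry up to max_retries times."""
    last_error = None
    for attempt in range(max_retries + 1):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        try:
            future = pool.submit(func, *args)
            # Raises concurrent.futures.TimeoutError if the call is too slow.
            return future.result(timeout=timeout_s)
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(backoff_s)
        finally:
            # Do not block on a worker that may still be running after a timeout.
            pool.shutdown(wait=False)
    raise RuntimeError(f"failed after {max_retries + 1} attempts") from last_error
```

Under that reading, a RAG call would use `max_retries=2` and a subgraph call `max_retries=1`. Note that a thread that timed out keeps running in the background; the sketch only stops waiting for it.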
 ### Flow Diagram
 ```
 react_reason (reasoning) ←────────────┐
  ↓                                    │
 conditional routing                   │
  ├─→ rag_retrieve (with retry) →──────┤
  ├─→ contact_subgraph →───────────────┤
  ├─→ dictionary_subgraph →────────────┤
  ├─→ news_analysis_subgraph →─────────┤
  ├─→ handle_error → (retry/degrade) →─┤
  └─→ finalize
  ↓
 END
 ```
 ---
 ## 📁 Key File List
 | File | Description |
 |------|------|
 | `backend/app/main_graph/utils/main_graph_builder.py` | Main graph construction (supports the hybrid router switch) |
 | `backend/app/main_graph/nodes/react_nodes.py` | React loop nodes |
 | `backend/app/main_graph/nodes/hybrid_router.py` | Hybrid router node (new) |
 | `backend/app/main_graph/nodes/rag_nodes.py` | RAG retrieval nodes |
 | `backend/app/main_graph/utils/retry_utils.py` | Timeout and retry utilities |
 | `backend/app/main_graph/state.py` | Main state |
 | `backend/app/core/intent.py` | React-mode intent reasoner |
 | `backend/app/model_services/chat_services.py` | Dual-model services (llm + smallLLM) |
 ---
 ## 🚀 Quick Start
 ```python
 from backend.app.main_graph.utils.main_graph_builder import build_react_main_graph
 # Build the graph (hybrid routing is enabled by default)
 graph = build_react_main_graph(use_hybrid_router=True)
 compiled_graph = graph.compile()
 # Invoke it with a greeting ("你好" means "hello")
 result = compiled_graph.invoke({"user_query": "你好", "user_id": "test"})
 print(result.final_result)
 ```
 ---
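 ## 🎉 Full Feature Summary
 ✅ Dual-model services (llm + smallLLM)  
 ✅ Front hybrid router (rule-based fast triage + lightweight intent classification)  
 ✅ Three fast paths (fast_chitchat, fast_rag, fast_tool)  
 ✅ Automatic escalation (fast path fails → full React loop)  
 ✅ SSE event enhancements (intent_classified, path_decision, fast_path_*, escalation)  
 ✅ Full React loop (up to 40 steps)  
 ✅ Structured error handling  
 ✅ Timeout and retry policies  
 ✅ Subgraph integration (contact, dictionary, news_analysis)  
 ✅ Backward compatible (use_hybrid_router=True/False)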
--- a/README.md
+++ b/README.md
@@ -45,6 +45,10 @@
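 - ✅ **Subgraph system**: modular subgraph architecture that shares common tools (intent understanding, human review, formatted output)
 - ✅ **Common tool library**: general-purpose tools such as web search and chart visualization, usable by the main graph and every subgraph
 - ✅ **React mode** ⭐: Reasoning → Acting → Observing loop; the LLM thinks before it acts and can make multiple tool calls
 - ✅ **Hybrid routing architecture** ⭐⭐: front fast routing (rule-based triage + lightweight intent classification) + full React loop (fallback)
 - ✅ **Dual-model services** ⭐: get_chat_service() (large model) + get_small_llm_service() (lightweight model)
 - ✅ **Automatic escalation**: when a fast path fails, control falls back to the full React loop
 - ✅ **Backward compatible**: switch between hybrid routing and pure React mode via use_hybrid_router=True/False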
 ---
--- a/backend/app/README.md
+++ b/backend/app/README.md
@@ -36,6 +36,7 @@ app/
 │   ├── nodes/               # Main graph nodes
 │   │   ├── __init__.py
 │   │   ├── react_nodes.py   # React-mode nodes (reasoning, routing)
 │   │   ├── hybrid_router.py # ⭐ Hybrid router node (front fast routing + automatic escalation)
 │   │   ├── llm_call.py      # LLM call node
 │   │   ├── retrieve_memory.py # Memory retrieval node
 │   │   ├── memory_trigger.py # Memory trigger node
@@ -52,7 +53,7 @@ app/
 │   │
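 │   └── utils/               # Main graph utility functions
 │       ├── __init__.py
-│       ├── main_graph_builder.py # Main graph builder
+│       ├── main_graph_builder.py # Main graph builder (supports the hybrid router switch)
 │       ├── retry_utils.py   # Retry utilities
 │       ├── rag_initializer.py # RAG initialization utility
 │       └── visualize_graph.py # Graph visualization utility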
--- a/backend/app/model_services/README.md
+++ b/backend/app/model_services/README.md
@@ -1,31 +1,85 @@
 """
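 Model services module (model_services)
-Provides a unified interface for obtaining embedding and rerank model services, with automatic fallback:
+Provides a unified interface for obtaining embedding, rerank, and generative LLM services, with automatic fallback.
 1. Prefer the local llama.cpp service
 2. When the local service is unavailable, automatically fall back to the Zhipu API service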
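The two-step fallback above can be sketched as an ordered list of factories that is tried until one succeeds. This is only an illustration: `first_available` and the factory names in the usage note are hypothetical and do not describe how `get_embedding_service()` is actually implemented.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_available(factories: Sequence[Callable[[], T]]) -> T:
    # Return the first service whose factory succeeds, trying them in order.
    errors = []
    for factory in factories:
        try:
            return factory()
        except Exception as exc:  # e.g. connection refused, missing API key
            errors.append(f"{getattr(factory, '__name__', factory)}: {exc}")
    raise RuntimeError("no service available: " + "; ".join(errors))
```

A caller would pass the local factory first, for example `first_available([make_local_llamacpp, make_zhipu])`, where both factory names are placeholders.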
-Usage:
+---
-from app.model_services import get_embedding_service, get_rerank_service, BaseReranker
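+## 📚 Generative LLM Service (Chat)
 ### Dual-Model Services
 | Function | Description |
 |------|------|
 | `get_chat_service()` | Returns the large-model service (for complex reasoning and generation) |
 | `get_small_llm_service()` | Returns the lightweight model service (for simple intent classification and quick Q&A) |
 | `get_all_chat_services()` | Returns all available generative LLM services (for switching between models) |
 ### Usage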
 ```python
 from app.model_services import get_chat_service, get_small_llm_service
 # Get the large-model service (complex tasks)
 llm = get_chat_service()
 response = llm.invoke("What is LangGraph?")
 # Get the lightweight model service (simple tasks)
 small_llm = get_small_llm_service()
 response = small_llm.invoke("Classify the user's intent: 'hello'")
 ```
 ---
 ## 📚 Embedding Model Service
 | Function | Description |
 |------|------|
 | `get_embedding_service()` | Returns the embedding model service (automatic fallback) |
 ### Usage
 ```python
 from app.model_services import get_embedding_service
 # Get the embedding service (LangChain-compatible Embeddings)
 embeddings = get_embedding_service()
 ```
 ---
 ## 📚 Rerank Model Service
 | Function | Description |
 |------|------|
 | `get_rerank_service()` | Returns the rerank model service (automatic fallback) |
 ### Usage
 ```python
 from app.model_services import get_rerank_service
 # Get the rerank service
 reranker = get_rerank_service()
 sorted_docs = reranker.compress_documents(documents, query, top_n=5)
 ```
-Environment variable configuration:
+---
 ## 🔧 Environment Variable Configuration
 ```env
 # Zhipu API configuration
-ZHIPUAI_API_KEY=your_api_key
+ZHIPUAI_API_KEY=***
 ZHIPU_EMBEDDING_MODEL=embedding-3  # options: embedding-2, embedding-3
 ZHIPU_RERANK_MODEL=rerank-2        # options: rerank-1, rerank-2
 ZHIPU_API_BASE=https://open.bigmodel.cn/api/paas/v4
 # DeepSeek API configuration (for the large model)
 DEEPSEEK_API_KEY=***
 # Local llama.cpp service configuration (existing settings unchanged)
 LLAMACPP_EMBEDDING_URL=http://localhost:port/v1
 LLAMACPP_RERANKER_URL=http://localhost:port/v1
-LLAMACPP_API_KEY=your_api_key
+LLAMACPP_API_KEY=***
 ```
 """
--- a/backend/docs/HYBRID_ROUTER.md
+++ b/backend/docs/HYBRID_ROUTER.md
@@ -1,269 +0,0 @@
 # Hybrid Agent Routing Architecture
 ## Architecture Overview
 ```
                     +-------------------+
                     |    User input     |
                     +---------+---------+
                               |
                               v
                     +-------------------+
                     | Intent classifier |
                     +---------+---------+
                               |
           +-------------------+-------------------+
           |                   |                   |
           v                   v                   v
  +-----------------+ +-----------------+ +-----------------+
  | Knowledge query | | Tool operation  | |  Complex task   |
  +--------+--------+ +--------+--------+ +--------+--------+
           |                   |                   |
           v                   v                   v
    +-------------+     +-------------+     +-------------+
    |  Fast RAG   |     |  Fast tool  |     | React loop  |
    +------+------+     +------+------+     +------+------+
           |                   |                   |
           +-------------------+-------------------+
                               |
                               v
                       +---------------+
                       | Final answer  |
                       +---------------+
 ```
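 ## Intent Types
 | Type | Description | Example | Path |
 |------|------|------|------|
 | `knowledge` | Knowledge query | "What is the company's expense reimbursement policy?" | Fast RAG |
 | `realtime` | Real-time data query | "Check the status of order 123" | Fast tool |
 | `action` | Perform an action | "Help me request a refund" | Fast tool |
 | `chitchat` | Chitchat | "Hello" | Direct answer |
 | `clarify` | Needs clarification | "I'd like to look something up..." | Clarifying question |
 | `mixed` | Complex task | "Check order + refund policy + write email" | React loop |
 ## Routing Rules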
 ```
 confidence < 0.6 → React loop (safe mode)
 confidence >= 0.6
    ├─ knowledge → fast RAG
    ├─ realtime → fast tool
    ├─ action → fast tool
    ├─ chitchat → direct answer
    ├─ clarify → clarifying question
    └─ mixed → React loop
 ```
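The routing rules above translate almost directly into a lookup table plus a threshold check. The sketch below is an assumption-laden illustration: `decide_path`, `FAST_PATHS`, and the path labels are hypothetical names, not the contents of the removed `hybrid_router.py`.

```python
# Hypothetical path labels, mirroring the routing table above.
FAST_PATHS = {
    "knowledge": "fast_rag",
    "realtime": "fast_tool",
    "action": "fast_tool",
    "chitchat": "direct_answer",
    "clarify": "clarify_question",
}


def decide_path(intent: str, confidence: float, threshold: float = 0.6) -> str:
    """Pick a path: low confidence or an unmapped intent goes to the React loop."""
    if confidence < threshold:
        return "react_loop"  # safe mode
    # "mixed" (and any unknown intent) is not in the table, so it falls through.
    return FAST_PATHS.get(intent, "react_loop")
```

Treating unknown intents the same way as `mixed` keeps the full loop as the single fallback.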
 ## File Structure
 ```
 backend/app/agent/
 ├── intent_classifier.py    # Intent classifier
 ├── hybrid_router.py        # Hybrid router implementation
 └── service.py              # Agent service (updated)
 ```
 ## SSE Events
 ### New Events
 | Event | Description | Data structure |
 |------|------|---------|
 | `intent_classified` | Intent classification completed | `{type: "intent_classified", intent: string, confidence: float, reasoning: string}` |
 | `path_decision` | Path decision completed | `{type: "path_decision", path: "fast|react_loop", intent: string}` |
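For reference, an event with the `intent_classified` shape from the table can be serialized as a Server-Sent Events frame as follows. The helper name `sse_frame` is hypothetical; only the `data: ...` line followed by a blank line is standard SSE framing.

```python
import json


def sse_frame(event: dict) -> str:
    """Serialize one event dict as a single-line SSE data frame."""
    # json.dumps emits no newlines by default, so one "data:" line is enough.
    return f"data: {json.dumps(event, ensure_ascii=False)}\n\n"


frame = sse_frame({
    "type": "intent_classified",
    "intent": "chitchat",
    "confidence": 0.95,
    "reasoning": "simple greeting",
})
```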
 ### Full Event Flow
 ```
 User message
  ↓
 intent_classified  (new!)
  ↓
 path_decision      (new!)
  ↓
 [node_start] llm_call
  ↓
 [reasoning] reasoning process
  ↓
 [tool_call_start] tool call starts
  ↓
 [tool_call_end] tool call ends
  ↓
 [llm_token] final answer
  ↓
 [human_review_request] human review (if any)
  ↓
 [done]
 ```
 ## Usage Examples
 ### Fast Path Example
 ```python
 # Input ("你好" means "hello")
 User: "你好"
 # Response
 intent_classified: {
  intent: "chitchat",
  confidence: 0.95,
  reasoning: "simple greeting"
 }
 path_decision: {
  path: "fast",
  intent: "chitchat"
 }
 llm_token: "你"...
 llm_token: "好"...
 ```
 ### React Loop Example
 ```python
 # Input
 User: "Look up my order, then draft an email"
 # Response
 intent_classified: {
  intent: "mixed",
  confidence: 0.92,
  reasoning: "requires an order lookup and an email draft; multi-step task"
 }
 path_decision: {
  path: "react_loop",
  intent: "mixed"
 }
 node_start: llm_call
 reasoning: "I need to look up the order first..."
 tool_call_start: get_order
 tool_call_end: result
 ...
 ```
 ## Quick Start
 ### 1. Initialize the intent classifier
 ```python
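 import asyncio
 from app.agent.intent_classifier import get_intent_classifier
 async def main() -> None:
     classifier = get_intent_classifier()
     # Classify a query ("What is the company's expense reimbursement policy?")
     result = await classifier.classify("公司报销政策是什么？")
     print(f"Intent: {result.intent_type}")
     print(f"Confidence: {result.confidence}")
     print(f"Reasoning: {result.reasoning}")
 # classify() is a coroutine, so it must run inside an event loop
 asyncio.run(main())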
 ```
 ### 2. Use the hybrid router
 ```python
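 import asyncio
 from app.agent.hybrid_router import HybridRouter
 from app.agent.intent_classifier import get_intent_classifier
 async def main() -> None:
     classifier = get_intent_classifier()
     router = HybridRouter(
         intent_classifier=classifier,
         rag_pipeline=None,  # pass in the RAG pipeline
         tool_registry={},   # pass in the tools
         react_graph=None,   # pass in the graph
     )
     # Routing decision for the greeting "你好" ("hello")
     decision = await router.route("你好")
     print(f"Decision: {decision.action}")
     # Execute the decision
     result = await router.execute(decision, "你好", "thread_123")
 # route() and execute() are coroutines, so they must run inside an event loop
 asyncio.run(main())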
 ```
 ## Configuration Options
 ### Confidence Threshold
 ```python
 # In the _make_decision method of backend/app/agent/hybrid_router.py
 if confidence < 0.6:  # adjust this threshold
     ...  # route to the React loop
 ```
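 ### Adding a New Intent Type
 1. Add the new type to the `IntentType` enum
 2. Add a routing rule to `routing_map`
 3. Add examples to `_build_examples`
 ## Key Advantages
 1. **Performance** - simple questions take the fast path
 2. **User experience** - responses are fast
 3. **Extensibility** - new intents are easy to add
 4. **Safety and reliability** - low-confidence requests go through the full loop
 5. **Observability** - the front end displays the path decision
 ## Testing Suggestions
 ### Test Cases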
 ```python
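 import asyncio
 from app.agent.intent_classifier import get_intent_classifier
 test_cases = [
     # Knowledge query
     ("公司报销政策是什么？", "knowledge"),
     # Real-time query
     ("查一下订单 123 的状态", "realtime"),
     # Perform an action
     ("帮我申请退款", "action"),
     # Chitchat
     ("你好", "chitchat"),
     # Clarification
     ("我想查点东西...", "clarify"),
     # Complex task
     ("查订单+退款政策+写邮件", "mixed"),
 ]
 async def main() -> None:
     classifier = get_intent_classifier()
     for query, expected_intent in test_cases:
         result = await classifier.classify(query)
         print(f"{query} → {result.intent_type} (expected: {expected_intent})")
 asyncio.run(main())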
 ```
 ## Extension Guide
 ### Adding a New Fast Path
 ```python
 # Add to HybridRouter
 async def _execute_custom_path(self, user_input: str) -> str:
     # Custom path logic goes here
     pass
 ```
 ### Adding a Cache Layer
 ```python
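 # Add a cache to IntentClassifier.
 # functools.lru_cache on an async method would cache the coroutine object,
 # which can be awaited only once, so a plain dict is used here instead.
 class IntentClassifier:
     def __init__(self) -> None:
         self._cache: dict = {}  # unbounded; add eviction if memory matters
     async def classify_cached(self, user_input: str):
         # Cache the classification result per input string
         if user_input not in self._cache:
             self._cache[user_input] = await self.classify(user_input)
         return self._cache[user_input]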
 ```
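 ## Notes
 1. Make sure the fallback strategy is sensible
 2. Monitor intent classification accuracy
 3. Tune the confidence threshold to real-world conditions
 4. The front end must handle the new SSE events
 5. Maintain backward compatibility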