
Author: 米落枫    Time: 12 hours ago
Title: LangGraph Adaptive RAG
Source: CSDN Blog
LangGraph Adaptive RAG


Introduction

Adaptive RAG is a RAG strategy that combines (1) query analysis with (2) active / self-corrective RAG.
In the paper, query analysis is reported to route retrieval across:

- No retrieval
- Single-shot RAG
- Iterative RAG

Let's build on this using LangGraph.

In our implementation, we will route between:

- Web search: for questions about recent events
- Self-corrective RAG: for questions related to our index
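The routing decision can be made by an LLM that emits a structured choice. Below is a minimal sketch of such a question router, assuming an OpenAI-compatible chat model via langchain_openai and LangChain's with_structured_output helper; the RouteQuery schema, the prompt wording, and the gpt-4o-mini model name are illustrative choices, not taken from the original post.

from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI


class RouteQuery(BaseModel):
    """Route a user question to web search or the vector store."""
    datasource: Literal["vectorstore", "web_search"] = Field(
        description="Use 'vectorstore' for questions about the indexed blog posts, otherwise 'web_search'."
    )


llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # any tool-calling chat model works here
structured_llm_router = llm.with_structured_output(RouteQuery)

system = (
    "You are an expert at routing a user question to a vectorstore or web search. "
    "The vectorstore contains documents about LLM agents, prompt engineering, and adversarial attacks on LLMs. "
    "Use the vectorstore for questions on these topics; otherwise, use web search."
)
route_prompt = ChatPromptTemplate.from_messages(
    [("system", system), ("human", "{question}")]
)
question_router = route_prompt | structured_llm_router

# Example routing calls
print(question_router.invoke({"question": "Who won the latest NBA championship?"}))  # -> web_search
print(question_router.invoke({"question": "What are the types of agent memory?"}))   # -> vectorstore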
Index
from typing import List
import requests
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Vearch
from langchain_core.embeddings import Embeddings
from langchain_core.pydantic_v1 import BaseModel
from langchain_text_splitters import RecursiveCharacterTextSplitter
from common.constant import VEARCH_ROUTE_URL, BGE_M3_EMB_URL


class Bgem3Embeddings(BaseModel, Embeddings):
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # NOTE: left as a stub in the original post
        print(texts)
        return []

    def embed_query(self, text: str) -> List[float]:
        if not text:
            return []
        return cop_embeddings(text)


def cop_embeddings(input: str) -> list:
    """Convert text to a dense vector with BGE-M3."""
    if not input.strip():
        return []
    headers = {"Content-Type": "application/json"}
    params = {"sentences": [input], "type": "dense"}
    response = requests.post(BGE_M3_EMB_URL, headers=headers, json=params)
    if response.status_code == 200:
        cop_embeddings_result = response.json()
        if not cop_embeddings_result or 'embeddings' not in cop_embeddings_result or not cop_embeddings_result['embeddings']:
            return []
        original_vector = cop_embeddings_result['embeddings'][0]
        original_size = len(original_vector)
        # Pad the 1024-dim BGE-M3 vector to 1536 dims to stay compatible with the OpenAI embedding interface
        adaptor_vector = [0.0] * 1536
        for i in range(min(original_size, 1536)):
            adaptor_vector[i] = original_vector[i]
        return adaptor_vector
    else:
        print(f"cop_embeddings error: {response.text}")
        return []


# Docs to index
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

# Load the documents
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]

# Split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

# Store the chunks in the Vearch vector store. The original snippet is truncated here;
# the keyword names below are assumed from Vearch.from_texts, and the values follow the
# original comment: VEARCH_ROUTE_URL, "lanchain_autogpt", "lanchain_autogpt_db", 3
embeddings_model = Bgem3Embeddings()
vectorstore = Vearch.from_documents(
    documents=doc_splits,
    embedding=embeddings_model,
    path_or_url=VEARCH_ROUTE_URL,
    table_name="lanchain_autogpt",
    db_name="lanchain_autogpt_db",
    flag=3,
)
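Once the index is built, the vector store can be exposed as a retriever for the downstream graph nodes. A small usage sketch, assuming the standard VectorStore.as_retriever helper; the sample question and the k value are illustrative:

# Expose the Vearch store as a retriever for the RAG nodes
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Quick sanity check: the query is embedded via Bgem3Embeddings.embed_query -> cop_embeddings
results = retriever.invoke("What are the types of agent memory?")
for doc in results:
    print(doc.metadata.get("source"), doc.page_content[:120])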
Original article: https://blog.csdn.net/for62/article/details/139754427



