AI创想
Title: LangGraph Adaptive RAG
Author: 米落枫
Time: 12 hours ago
Source: CSDN blog

LangGraph Adaptive RAG
Contents: Introduction · Index · LLMs · Web search tool · Graph · Graph state · Graph flow · Build graph · Execution
Introduction
Adaptive RAG is a RAG strategy that combines (1) query analysis with (2) active/self-corrective RAG.
In the paper, they report using query analysis to route retrieval across:
No Retrieval / Single-shot RAG / Iterative RAG
Let's build on this using LangGraph.
In our implementation, we will route between:
Web search: for questions about recent events
Self-corrective RAG: for questions related to our index
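The routing decision above can be sketched as follows. The actual implementation asks an LLM (with structured output) to pick a datasource; the keyword heuristic below is only a hypothetical stand-in so the example runs offline, and `INDEX_TOPICS` is an assumed list matching the documents indexed later in this post:

```python
from dataclasses import dataclass

# Hypothetical stand-in for the LLM-based question router.
# Topics assumed to match the indexed Lilian Weng posts below.
INDEX_TOPICS = ["agent", "prompt engineering", "adversarial attack"]

@dataclass
class RouteQuery:
    datasource: str  # "vectorstore" or "web_search"

def route_question(question: str) -> RouteQuery:
    """Route to the vectorstore when the question matches our indexed
    topics; otherwise fall back to web search (e.g. recent events)."""
    q = question.lower()
    if any(topic in q for topic in INDEX_TOPICS):
        return RouteQuery(datasource="vectorstore")
    return RouteQuery(datasource="web_search")

print(route_question("What are the types of agent memory?").datasource)  # vectorstore
print(route_question("Who won the match yesterday?").datasource)         # web_search
```

In the real graph this decision becomes a conditional edge: the router's output selects which node (web search or retrieval) runs first.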
Index
from typing import List

import requests
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Vearch
from langchain_core.embeddings import Embeddings
from langchain_core.pydantic_v1 import BaseModel
from langchain_text_splitters import RecursiveCharacterTextSplitter

from common.constant import VEARCH_ROUTE_URL, BGE_M3_EMB_URL


class Bgem3Embeddings(BaseModel, Embeddings):
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Embed every chunk so the vector store receives real vectors
        return [cop_embeddings(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        if not text:
            return []
        return cop_embeddings(text)


def cop_embeddings(input: str) -> list:
    """Embed a string with bge-m3 via the HTTP embedding service."""
    if not input.strip():
        return []
    headers = {"Content-Type": "application/json"}
    params = {"sentences": [input], "type": "dense"}
    response = requests.post(BGE_M3_EMB_URL, headers=headers, json=params)
    if response.status_code == 200:
        result = response.json()
        if not result or 'embeddings' not in result or not result['embeddings']:
            return []
        original_vector = result['embeddings'][0]
        # Pad the 1024-dim bge-m3 vector to 1536 dims for compatibility
        # with the OpenAI embedding interface
        adaptor_vector = [0.0] * 1536
        for i in range(min(len(original_vector), 1536)):
            adaptor_vector[i] = original_vector[i]
        return adaptor_vector
    else:
        print(f"cop_embeddings error: {response.text}")
        return []


# Docs to index
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

# Load documents
docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist]

# Split documents into chunks
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=500, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

# Store the chunks in the vector store (keyword arguments reconstructed
# from the commented-out hint in the original snippet)
embeddings_model = Bgem3Embeddings()
vectorstore = Vearch.from_documents(
    documents=doc_splits,
    embedding=embeddings_model,
    path_or_url=VEARCH_ROUTE_URL,
    table_name="lanchain_autogpt",
    db_name="lanchain_autogpt_db",
    flag=3,
)
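The 1024-to-1536 padding trick used above can be isolated and checked on its own. A minimal sketch, independent of the HTTP embedding service (`pad_to_openai_dim` is a name introduced here for illustration):

```python
def pad_to_openai_dim(vector: list, target: int = 1536) -> list:
    """Zero-pad (or truncate) an embedding so its length matches the
    OpenAI embedding dimension, as done for bge-m3's 1024 dims."""
    padded = [0.0] * target
    for i in range(min(len(vector), target)):
        padded[i] = vector[i]
    return padded

v = pad_to_openai_dim([0.1] * 1024)
print(len(v))            # 1536
print(v[1023], v[1024])  # 0.1 0.0
```

Note the padding only makes the dimensions line up with stores that expect 1536-dim vectors; the trailing zeros carry no semantic information, and cosine similarity between padded bge-m3 vectors is unchanged by the extra zeros.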
Original article: https://blog.csdn.net/for62/article/details/139754427