Using Weaviate with the Generative OpenAI module for generative search

This notebook is intended for a scenario where:

Your data is already in Weaviate
You want to use Weaviate with the Generative OpenAI module (generative-openai).

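Prerequisites

This cookbook covers only generative search examples; it does not cover configuration or data import.

To make the most of this cookbook, please complete the Getting Started cookbook first, where you will learn the essentials of working with Weaviate and import the demo data.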

Checklist

completed the Getting Started cookbook,
created a Weaviate instance,
imported data into your Weaviate instance,
obtained an OpenAI API key.

===========================================================

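Prepare your OpenAI API key

The OpenAI API key is used to vectorize your data at import time and to run queries.

If you don't have an OpenAI API key, you can get one from https://beta.openai.com/account/api-keys.

Once you have your key, add it to your environment variables as OPENAI_API_KEY.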

    # Test that your OpenAI API key is correctly set as an environment variable
    # Note: if you run this notebook locally, you will need to reload your terminal
    # and the notebook for the env variables to be live.
    import os

    # Note: alternatively you can set a temporary env variable like this:
    # os.environ["OPENAI_API_KEY"] = 'your-key-goes-here'

    if os.getenv("OPENAI_API_KEY") is not None:
        print("OPENAI_API_KEY is ready")
    else:
        print("OPENAI_API_KEY environment variable not found")

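Connect to your Weaviate instance

In this section, we will:

test the environment variable OPENAI_API_KEY (make sure you completed the step in the "Prepare your OpenAI API key" section)
connect to your Weaviate instance with your OpenAI API key
and test the client connection

The client

After this step, the client object will be used to perform all Weaviate-related operations.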

    import weaviate
    from datasets import load_dataset
    import os

    # Connect to your Weaviate instance
    client = weaviate.Client(
        url="https://your-wcs-instance-name.weaviate.network/",
        # url="https://:8080/",
        # Comment out the next line if you are not using authentication for your
        # Weaviate instance (i.e. for locally deployed instances)
        auth_client_secret=weaviate.auth.AuthApiKey(api_key="<YOUR-WEAVIATE-API-KEY>"),
        additional_headers={
            "X-OpenAI-Api-Key": os.getenv("OPENAI_API_KEY")
        }
    )

    # Check if your instance is live and ready
    # This should return `True`
    client.is_ready()
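A freshly started instance may take a moment before client.is_ready() returns True. The following is a minimal sketch of a polling helper, not part of the Weaviate client: the name wait_until_ready and its parameters are hypothetical, and it only assumes that the object passed in exposes an is_ready() method, as the client above does.

```python
import time


def wait_until_ready(client, timeout_s=30.0, interval_s=1.0, sleep=time.sleep):
    """Poll client.is_ready() until it returns True or the timeout elapses.

    Hypothetical helper: `client` only needs an is_ready() method.
    Returns True if the instance became ready in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if client.is_ready():
                return True
        except Exception:
            # Assumption: a failed check means "not ready yet", so keep polling.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

With the client created above, wait_until_ready(client) could replace the bare client.is_ready() call when the instance was started only moments ago.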

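Generative search

Weaviate offers a Generative Search OpenAI module, which generates responses based on the data stored in your Weaviate instance.

You construct a generative search query in much the same way as a standard semantic search query in Weaviate.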

For example:

search in "Articles",
return "title", "content", "url",
look for objects related to "football clubs",
limit results to 5 objects

    result = (
        client.query
        .get("Articles", ["title", "content", "url"])
        .with_near_text({"concepts": ["football clubs"]})
        .with_limit(5)
        # generative query will go here
        .do()
    )

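Now you can add the with_generate() function to apply a generative transformation. with_generate takes one of the following parameters:

single_prompt - generates a response for each returned object,
grouped_task - generates a single response from all returned objects.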
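A single_prompt refers to object properties through placeholders in curly braces, such as {content}. The sketch below checks a prompt against the list of properties requested in the query before anything is sent. The helper names are hypothetical, and the check rests on the assumption that every placeholder should name a requested property; it uses only Python's standard string.Formatter.

```python
from string import Formatter


def prompt_placeholders(prompt):
    """Return the names used as {placeholders} in a prompt, in order of appearance."""
    return [name for _, name, _, _ in Formatter().parse(prompt) if name]


def missing_properties(prompt, properties):
    """List placeholders that do not appear among the requested properties.

    Assumption: each placeholder should match a property passed to .get().
    """
    return [name for name in prompt_placeholders(prompt) if name not in properties]


prompt = "Summarize in a short tweet the following content: {content}"
print(missing_properties(prompt, ["title", "content", "url"]))
```

An empty list means every placeholder in the prompt is covered by the requested properties.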

    def generative_search_per_item(query, collection_name):
        prompt = "Summarize in a short tweet the following content: {content}"

        result = (
            client.query
            .get(collection_name, ["title", "content", "url"])
            .with_near_text({"concepts": [query], "distance": 0.7})
            .with_limit(5)
            .with_generate(single_prompt=prompt)
            .do()
        )

        # Check for errors
        if "errors" in result:
            print("\033[91mYou probably have run out of OpenAI API calls for the current minute – the limit is set at 60 per minute.")
            raise Exception(result["errors"][0]['message'])

        return result["data"]["Get"][collection_name]

    query_result = generative_search_per_item("football clubs", "Article")

    for i, article in enumerate(query_result):
        print(f"{i+1}. {article['title']}")
        print(article['_additional']['generate']['singleResult'])  # print generated response
        print("-----------------------")
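The search function raises an exception when the response contains errors, and its message attributes this to the per-minute limit on OpenAI API calls. One way to cope with that is to retry after a pause. The sketch below is a generic retry wrapper with exponential backoff; call_with_retry and its parameters are hypothetical, and for simplicity it retries on any exception rather than only on rate-limit errors.

```python
import time


def call_with_retry(fn, *args, attempts=3, base_delay_s=2.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying with exponential backoff when it raises.

    Hypothetical helper: retries on any exception, which is a simplification.
    """
    for attempt in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay_s, then twice that, then four times that, ...
            sleep(base_delay_s * 2 ** attempt)
```

For example, call_with_retry(generative_search_per_item, "football clubs", "Article") would run the same search, pausing between failed attempts. With a limit counted per minute, a larger base_delay_s may be needed for the pause to be useful.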

    def generative_search_group(query, collection_name):
        generateTask = "Explain what these have in common"

        result = (
            client.query
            .get(collection_name, ["title", "content", "url"])
            .with_near_text({"concepts": [query], "distance": 0.7})
            .with_generate(grouped_task=generateTask)
            .with_limit(5)
            .do()
        )

        # Check for errors
        if "errors" in result:
            print("\033[91mYou probably have run out of OpenAI API calls for the current minute – the limit is set at 60 per minute.")
            raise Exception(result["errors"][0]['message'])

        return result["data"]["Get"][collection_name]
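Because grouped_task produces one response for the whole result set, reading it differs from the per-item case. The sketch below assumes the grouped response is attached to the first returned object under _additional.generate.groupedResult, mirroring the singleResult key used for per-item results. The helper name and the sample data are hypothetical and only illustrate the assumed shape.

```python
def grouped_result(objects):
    """Return the grouped response from a list of returned objects, or None.

    Assumption: the response sits on the first object under
    _additional -> generate -> groupedResult.
    """
    if not objects:
        return None
    generate = (objects[0].get("_additional") or {}).get("generate") or {}
    if generate.get("error"):
        raise RuntimeError(generate["error"])
    return generate.get("groupedResult")


# Hypothetical data in the assumed shape, for illustration only.
sample = [
    {
        "title": "A",
        "_additional": {
            "generate": {"groupedResult": "Both are about football.", "error": None}
        },
    },
    {"title": "B", "_additional": {"generate": None}},
]
print(grouped_result(sample))
```

With a live instance, grouped_result(generative_search_group("football clubs", "Article")) would be the equivalent call.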

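Thanks for following along. You are now equipped to set up your own vector databases and use embeddings to do all kinds of cool things - enjoy! For more complex use cases, please continue to work through the other cookbook examples in this repo.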
