Multi-Tool Orchestration with RAG Using OpenAI's Responses API

Mar 28, 2025

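This cookbook guides you through building dynamic, multi-tool workflows with OpenAI's Responses API. It demonstrates a Retrieval-Augmented Generation (RAG) approach that routes each user query to the appropriate built-in or external tool. Whether a query calls for general knowledge or for specific internal context from a vector database such as Pinecone, this guide shows how to combine function calling, the built-in web search tool, and document retrieval to generate accurate, context-aware responses.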

For a practical example of performing RAG on PDFs using the Responses API's file search feature, refer to the notebook.

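This example showcases the flexibility of the Responses API: beyond the built-in file_search tool, which connects to a hosted vector store, it can just as easily connect to an external vector database. This lets you implement a RAG approach alongside hosted tooling, giving a versatile solution for a range of retrieval and generation tasks.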

#%pip install datasets tqdm pandas pinecone openai --quiet

import os
import time
from tqdm.auto import tqdm
from pandas import DataFrame
from datasets import load_dataset
import random
import string


# Import OpenAI client and initialize with your API key.
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Import Pinecone client and related specifications.
from pinecone import Pinecone
from pinecone import ServerlessSpec

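In this example we use a sample medical reasoning dataset from Hugging Face. We convert the dataset into a Pandas DataFrame and merge the "Question" and "Response" columns into a single string. This merged text is used for embeddings, and the question and answer are later stored as metadata.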

# Load the dataset (ensure you're logged in with huggingface-cli if needed)
ds = load_dataset("FreedomIntelligence/medical-o1-reasoning-SFT", "en", split='train[:100]', trust_remote_code=True)
ds_dataframe = DataFrame(ds)

# Merge the Question and Response columns into a single string.
ds_dataframe['merged'] = ds_dataframe.apply(
    lambda row: f"Question: {row['Question']} Answer: {row['Response']}", axis=1
)
print("Example merged text:", ds_dataframe['merged'].iloc[0])
Example merged text: Question: A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions? Answer: Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying. Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.
ds_dataframe
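|     | Question | Complex_CoT | Response | merged |
| --- | --- | --- | --- | --- |
| 0 | A 61-year-old woman with a long history of... | Okay, let's think about this step by step. ... | Cystometry in this case of stress urinary incontinence... | Question: A 61-year-old woman with a long history of... |
| 1 | A 45-year-old man with a history of alcohol use... | Alright, let's break this down. We have a 45-year-old... | Considering the clinical presentation of sudden onset... | Question: A 45-year-old man with a history of alcohol use... |
| 2 | A 45-year-old man presents with symptoms including... | Okay, so here's a 45-year-old man who is experiencing... | Based on the clinical findings presented... | Question: A 45-year-old man presents with symptoms including... |
| 3 | A patient with psoriasis was treated with systemic... | I'm thinking about this patient with psoriasis... | The development of generalized pustules... | Question: A patient with psoriasis was treated... |
| 4 | What is the most likely diagnosis for a 2-year-old child... | Okay, so we're dealing with a 2-year-old child... | Based on the described symptoms and the unusual... | Question: What is the most likely diagnosis for a 2-year-old child... |
| ... | ... | ... | ... | ... |
| 95 | An electrical current flows along a flat plate... | Alright, to find out the temperature at the center... | The correct answer is F. 1549°F. | Question: An electrical current flows along ... |
| 96 | A herpetologist bitten by a poisonous snake... | Alright, so we're dealing with a case where a ... | The snake venom is most likely to affect... | Question: A herpetologist bitten by a poisonous snake... |
| 97 | A 34-year-old person is rapidly developing ... | Alright, let's break down what's happening... | The symptoms described in the question are most consistent with... | Question: A 34-year-old person is rapidly developing ... |
| 98 | What is the term used to describe the type of injury ... | Okay, so I need to figure out what kind of inj... | The term used to describe the type of injury ... | Question: What is the term used to describe ... |
| 99 | During water chlorination, ... | Alright, let's start thinking about this ... | The effective disinfecting action during ... | Question: During water chlorination ... |

100 rows × 4 columns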

MODEL = "text-embedding-3-small"  # Replace with your production embedding model if needed
# Compute an embedding for the first document to obtain the embedding dimension.
sample_embedding_resp = client.embeddings.create(
    input=[ds_dataframe['merged'].iloc[0]],
    model=MODEL
)
embed_dim = len(sample_embedding_resp.data[0].embedding)
print(f"Embedding dimension: {embed_dim}")
Embedding dimension: 1536

# Initialize Pinecone using your API key.
pc = Pinecone(api_key=os.getenv("PINECONE_API_KEY"))

# Define the Pinecone serverless specification.
AWS_REGION = "us-east-1"
spec = ServerlessSpec(cloud="aws", region=AWS_REGION)

# Create a random index name with lower case alphanumeric characters and '-'
index_name = 'pinecone-index-' + ''.join(random.choices(string.ascii_lowercase + string.digits, k=10))

# Create the index if it doesn't already exist.
if index_name not in pc.list_indexes().names():
    pc.create_index(
        index_name,
        dimension=embed_dim,
        metric='dotproduct',
        spec=spec
    )

# Connect to the index.
index = pc.Index(index_name)
time.sleep(1)
print("Index stats:", index.describe_index_stats())
Index stats: {'dimension': 1536,
 'index_fullness': 0.0,
 'metric': 'dotproduct',
 'namespaces': {},
 'total_vector_count': 0,
 'vector_type': 'dense'}
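
The fixed `time.sleep(1)` above is a guess at how long index creation takes. A minimal sketch of a polling alternative follows. `wait_until` is a hypothetical helper, not part of the Pinecone client, and the commented-out readiness check assumes `pc.describe_index(index_name).status["ready"]` behaves as described in Pinecone's documentation. The demo uses a fake readiness check so it runs without any network access.

```python
import time


def wait_until(is_ready, timeout_s=60.0, poll_interval_s=1.0):
    """Poll is_ready() until it returns True or the timeout elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


# Assumed usage against Pinecone (not executed here):
# wait_until(lambda: pc.describe_index(index_name).status["ready"])

# Stand-in readiness check that succeeds on the third poll.
calls = {"count": 0}


def fake_ready():
    calls["count"] += 1
    return calls["count"] >= 3


became_ready = wait_until(fake_ready, timeout_s=5.0, poll_interval_s=0.01)
timed_out = wait_until(lambda: False, timeout_s=0.05, poll_interval_s=0.01)
print(became_ready, timed_out)
```

Polling with a deadline avoids both sleeping longer than necessary and proceeding before the index can accept writes.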

Upsert the Dataset into the Pinecone Index

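Process the dataset in batches: generate embeddings for each merged text, prepare metadata (including separate Question and Answer fields), and upsert each batch into the index. You may also update the metadata of specific entries if needed.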

batch_size = 32
for i in tqdm(range(0, len(ds_dataframe['merged']), batch_size), desc="Upserting to Pinecone"):
    i_end = min(i + batch_size, len(ds_dataframe['merged']))
    lines_batch = ds_dataframe['merged'][i: i_end]
    ids_batch = [str(n) for n in range(i, i_end)]
    
    # Create embeddings for the current batch.
    res = client.embeddings.create(input=lines_batch.tolist(), model=MODEL)
    embeds = [record.embedding for record in res.data]
    
    # Prepare metadata by extracting original Question and Answer.
    meta = []
    for record in ds_dataframe.iloc[i:i_end].to_dict('records'):
        q_text = record['Question']
        a_text = record['Response']
        # Optionally update metadata for specific entries.
        meta.append({"Question": q_text, "Answer": a_text})
    
    # Upsert the batch into Pinecone.
    vectors = list(zip(ids_batch, embeds, meta))
    index.upsert(vectors=vectors)
Upserting to Pinecone: 100%|██████████| 4/4 [00:06<00:00,  1.64s/it]
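
The index arithmetic in the loop above can be isolated into a small generator, which makes the batch boundaries easy to check before any API calls are made. This is only a sketch: `iter_batches` is a hypothetical helper name, and the `docs` list stands in for the merged-text column.

```python
from typing import Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[Tuple[int, Sequence[T]]]:
    """Yield (start_offset, batch) pairs that cover items in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield start, items[start:start + batch_size]


# Stand-in for the 100 merged documents.
docs = [f"doc-{n}" for n in range(100)]
batches = list(iter_batches(docs, 32))
batch_sizes = [len(batch) for _, batch in batches]

# IDs derived from the offset stay aligned with row positions, as in the loop above.
last_start, last_batch = batches[-1]
last_ids = [str(last_start + k) for k in range(len(last_batch))]

print(batch_sizes)  # [32, 32, 32, 4]
print(last_ids)     # ['96', '97', '98', '99']
```

With 100 rows and a batch size of 32 this yields four batches, matching the four iterations shown in the progress bar.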


Query the Pinecone Index

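Create a natural language query, compute its embedding, and perform a similarity search on the Pinecone index. The returned results include metadata, which provides the context for generating answers.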

def query_pinecone_index(client, index, model, query_text):
    # Generate an embedding for the query.
    query_embedding = client.embeddings.create(input=query_text, model=model).data[0].embedding

    # Query the index and return top 5 matches.
    res = index.query(vector=query_embedding, top_k=5, include_metadata=True)
    print("Query Results:")
    for match in res['matches']:
        print(f"{match['score']:.2f}: {match['metadata'].get('Question', 'N/A')} - {match['metadata'].get('Answer', 'N/A')}")
    return res
# Example usage with a different query from the train/test set
query = (
    "A 45-year-old man with a history of alcohol use presents with symptoms including confusion, ataxia, and ophthalmoplegia. "
    "What is the most likely diagnosis and the recommended treatment?"
)
query_pinecone_index(client, index, MODEL, query)
Query Results:
0.70: A 45-year-old man with a history of alcohol use, who has been abstinent for the past 10 years, presents with sudden onset dysarthria, shuffling gait, and intention tremors. Given this clinical presentation and history, what is the most likely diagnosis? - Considering the clinical presentation of sudden onset dysarthria, shuffling gait, and intention tremors in a 45-year-old man with a history of alcohol use who has been abstinent for the past 10 years, the most likely diagnosis is acquired hepatocerebral degeneration.

This condition is associated with chronic liver disease, which can often be a consequence of long-term alcohol use. Despite the patient's abstinence from alcohol for a decade, previous alcohol use may have led to underlying liver dysfunction. This dysfunction, even if subclinical, can cause encephalopathy due to the accumulation of neurotoxic substances that affect the brain. The sudden onset of these neurological symptoms aligns with how acquired hepatocerebral degeneration can manifest, making it a probable diagnosis in this scenario.
0.55: A 45-year-old man presents with symptoms including a wide-based gait, a blank facial expression, hallucinations, memory issues, a resting tremor that resolves with movement, and bradykinesia. Based on these clinical findings, what is most likely to be observed in the histological specimen of his brain? - Based on the clinical findings presented—wide-based gait, blank facial expression, hallucinations, memory issues, resting tremor that resolves with movement, and bradykinesia—it is likely that the 45-year-old man is experiencing a condition related to Parkinsonism, possibly Parkinson's disease or dementia with Lewy bodies. Both of these conditions are associated with the presence of Lewy bodies in the brain. Lewy bodies are abnormal aggregates of protein, primarily alpha-synuclein, which can cause both the motor and cognitive symptoms observed in this patient. Therefore, in the histological specimen of his brain, you would most likely observe the presence of Lewy bodies.
0.53: A 73-year-old man is evaluated for increasing forgetfulness, getting lost while walking, irritability, and difficulty recalling recent events while retaining detailed memories from over 20 years ago. On examination, he is oriented to person and place but disoriented to time, and an MRI of the brain reveals significant changes. Considering these symptoms and the imaging findings, what is the most likely underlying pathological process contributing to the patient's condition? - The symptoms and MRI findings of this 73-year-old man suggest the most likely underlying pathological process is the buildup of amyloid-beta plaques and tau protein tangles, which are characteristic of Alzheimer's disease. These changes often begin in brain regions involved in memory, such as the hippocampus and temporal lobes, leading to the gradual memory decline, disorientation, and personality changes observed in the patient.
0.42: A 2-day-old male newborn delivered at 36 weeks presents with generalized convulsions, lethargy, feeding difficulties, icterus, purpura, posterior uveitis, and failed auditory screening. Cranial ultrasonography shows ventricular dilatation and hyperechoic foci in multiple brain areas. Considering these clinical signs and history, what is the most likely diagnosis? - The symptoms and findings you've described in this 2-day-old newborn point towards congenital Toxoplasmosis. The combination of neurological symptoms (such as convulsions and ventricular dilatation with hyperechoic foci), the presence of posterior uveitis, and the skin manifestations like purpura, all fit into the classic presentation of a TORCH infection. Toxoplasmosis, specifically, is known to cause widespread calcifications in the brain, not limited to the periventricular areas, which matches the ultrasound findings. Additionally, while hearing loss is more traditionally associated with CMV, it can also occur in Toxoplasmosis. Thus, the most likely diagnosis given this clinical picture is congenital Toxoplasmosis.
0.42: A 45-year-old male patient experiences double vision specifically when walking upstairs. Considering his well-controlled history of Type-II diabetes, which cranial nerve is most likely involved in his symptoms? - Based on the symptoms described, the cranial nerve most likely involved in the double vision experienced by this patient while walking upstairs is the trochlear nerve, or cranial nerve IV. This nerve controls the superior oblique muscle, which plays a role in stabilizing the eye during certain movements, including the coordination required when looking upwards while walking upstairs. Given the patient's history of diabetes, cranial neuropathies can occur, and CN IV involvement can lead to vertical diplopia that becomes noticeable during specific activities like walking up stairs. Therefore, the trochlear nerve is a likely candidate for the involvement in these symptoms.
{'matches': [{'id': '1',
              'metadata': {'Answer': 'Considering the clinical presentation of '
                                     'sudden onset dysarthria, shuffling gait, '
                                     'and intention tremors in a 45-year-old '
                                     'man with a history of alcohol use who '
                                     'has been abstinent for the past 10 '
                                     'years, the most likely diagnosis is '
                                     'acquired hepatocerebral degeneration.\n'
                                     '\n'
                                     'This condition is associated with '
                                     'chronic liver disease, which can often '
                                     'be a consequence of long-term alcohol '
                                     "use. Despite the patient's abstinence "
                                     'from alcohol for a decade, previous '
                                     'alcohol use may have led to underlying '
                                     'liver dysfunction. This dysfunction, '
                                     'even if subclinical, can cause '
                                     'encephalopathy due to the accumulation '
                                     'of neurotoxic substances that affect the '
                                     'brain. The sudden onset of these '
                                     'neurological symptoms aligns with how '
                                     'acquired hepatocerebral degeneration can '
                                     'manifest, making it a probable diagnosis '
                                     'in this scenario.',
                           'Question': 'A 45-year-old man with a history of '
                                       'alcohol use, who has been abstinent '
                                       'for the past 10 years, presents with '
                                       'sudden onset dysarthria, shuffling '
                                       'gait, and intention tremors. Given '
                                       'this clinical presentation and '
                                       'history, what is the most likely '
                                       'diagnosis?'},
              'score': 0.697534442,
              'values': []},
             {'id': '2',
              'metadata': {'Answer': 'Based on the clinical findings '
                                     'presented—wide-based gait, blank facial '
                                     'expression, hallucinations, memory '
                                     'issues, resting tremor that resolves '
                                     'with movement, and bradykinesia—it is '
                                     'likely that the 45-year-old man is '
                                     'experiencing a condition related to '
                                     "Parkinsonism, possibly Parkinson's "
                                     'disease or dementia with Lewy bodies. '
                                     'Both of these conditions are associated '
                                     'with the presence of Lewy bodies in the '
                                     'brain. Lewy bodies are abnormal '
                                     'aggregates of protein, primarily '
                                     'alpha-synuclein, which can cause both '
                                     'the motor and cognitive symptoms '
                                     'observed in this patient. Therefore, in '
                                     'the histological specimen of his brain, '
                                     'you would most likely observe the '
                                     'presence of Lewy bodies.',
                           'Question': 'A 45-year-old man presents with '
                                       'symptoms including a wide-based gait, '
                                       'a blank facial expression, '
                                       'hallucinations, memory issues, a '
                                       'resting tremor that resolves with '
                                       'movement, and bradykinesia. Based on '
                                       'these clinical findings, what is most '
                                       'likely to be observed in the '
                                       'histological specimen of his brain?'},
              'score': 0.55345,
              'values': []},
             {'id': '19',
              'metadata': {'Answer': 'The symptoms and MRI findings of this '
                                     '73-year-old man suggest the most likely '
                                     'underlying pathological process is the '
                                     'buildup of amyloid-beta plaques and tau '
                                     'protein tangles, which are '
                                     "characteristic of Alzheimer's disease. "
                                     'These changes often begin in brain '
                                     'regions involved in memory, such as the '
                                     'hippocampus and temporal lobes, leading '
                                     'to the gradual memory decline, '
                                     'disorientation, and personality changes '
                                     'observed in the patient.',
                           'Question': 'A 73-year-old man is evaluated for '
                                       'increasing forgetfulness, getting lost '
                                       'while walking, irritability, and '
                                       'difficulty recalling recent events '
                                       'while retaining detailed memories from '
                                       'over 20 years ago. On examination, he '
                                       'is oriented to person and place but '
                                       'disoriented to time, and an MRI of the '
                                       'brain reveals significant changes. '
                                       'Considering these symptoms and the '
                                       'imaging findings, what is the most '
                                       'likely underlying pathological process '
                                       "contributing to the patient's "
                                       'condition?'},
              'score': 0.526201367,
              'values': []},
             {'id': '38',
              'metadata': {'Answer': "The symptoms and findings you've "
                                     'described in this 2-day-old newborn '
                                     'point towards congenital Toxoplasmosis. '
                                     'The combination of neurological symptoms '
                                     '(such as convulsions and ventricular '
                                     'dilatation with hyperechoic foci), the '
                                     'presence of posterior uveitis, and the '
                                     'skin manifestations like purpura, all '
                                     'fit into the classic presentation of a '
                                     'TORCH infection. Toxoplasmosis, '
                                     'specifically, is known to cause '
                                     'widespread calcifications in the brain, '
                                     'not limited to the periventricular '
                                     'areas, which matches the ultrasound '
                                     'findings. Additionally, while hearing '
                                     'loss is more traditionally associated '
                                     'with CMV, it can also occur in '
                                     'Toxoplasmosis. Thus, the most likely '
                                     'diagnosis given this clinical picture is '
                                     'congenital Toxoplasmosis.',
                           'Question': 'A 2-day-old male newborn delivered at '
                                       '36 weeks presents with generalized '
                                       'convulsions, lethargy, feeding '
                                       'difficulties, icterus, purpura, '
                                       'posterior uveitis, and failed auditory '
                                       'screening. Cranial ultrasonography '
                                       'shows ventricular dilatation and '
                                       'hyperechoic foci in multiple brain '
                                       'areas. Considering these clinical '
                                       'signs and history, what is the most '
                                       'likely diagnosis?'},
              'score': 0.422916651,
              'values': []},
             {'id': '31',
              'metadata': {'Answer': 'Based on the symptoms described, the '
                                     'cranial nerve most likely involved in '
                                     'the double vision experienced by this '
                                     'patient while walking upstairs is the '
                                     'trochlear nerve, or cranial nerve IV. '
                                     'This nerve controls the superior oblique '
                                     'muscle, which plays a role in '
                                     'stabilizing the eye during certain '
                                     'movements, including the coordination '
                                     'required when looking upwards while '
                                     "walking upstairs. Given the patient's "
                                     'history of diabetes, cranial '
                                     'neuropathies can occur, and CN IV '
                                     'involvement can lead to vertical '
                                     'diplopia that becomes noticeable during '
                                     'specific activities like walking up '
                                     'stairs. Therefore, the trochlear nerve '
                                     'is a likely candidate for the '
                                     'involvement in these symptoms.',
                           'Question': 'A 45-year-old male patient experiences '
                                       'double vision specifically when '
                                       'walking upstairs. Considering his '
                                       'well-controlled history of Type-II '
                                       'diabetes, which cranial nerve is most '
                                       'likely involved in his symptoms?'},
              'score': 0.420719624,
              'values': []}],
 'namespace': '',
 'usage': {'read_units': 6}}
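
The scores in these results are dot products, because the index was created with `metric='dotproduct'`. As a rough mental model only (not how Pinecone is implemented), a brute-force version of the same ranking over toy 2-dimensional vectors looks like the sketch below. The helper names and the toy vectors are illustrative assumptions. For unit-length vectors, which OpenAI documents its embeddings to be, the dot product equals cosine similarity, which is why the scores fall between -1 and 1.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def top_k(query_vec, stored, k):
    """Return the k (doc_id, score) pairs with the highest dot product."""
    scored = [(doc_id, dot(query_vec, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "index": three unit-length vectors keyed by document ID.
stored = {
    "a": normalize([1.0, 0.0]),
    "b": normalize([1.0, 1.0]),
    "c": normalize([0.0, 1.0]),
}

results = top_k(normalize([2.0, 0.5]), stored, k=2)
for doc_id, score in results:
    print(f"{score:.2f}: {doc_id}")
```

A real vector database replaces the exhaustive scan with an approximate nearest-neighbor index, but the ordering criterion is the same.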

Generate a Response Using the Retrieved Context

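Take the best-matching results from your query (the top 3 here) and use the OpenAI Responses API to generate a final answer by combining the retrieved context with the original question.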

# Retrieve and concatenate top 3 match contexts.
matches = index.query(
    vector=client.embeddings.create(input=query, model=MODEL).data[0].embedding,
    top_k=3,
    include_metadata=True
)['matches']

context = "\n\n".join(
    f"Question: {m['metadata'].get('Question', '')}\nAnswer: {m['metadata'].get('Answer', '')}"
    for m in matches
)
# Use the context to generate a final answer.
response = client.responses.create(
    model="gpt-4o",
    input=f"Provide the answer based on the context: {context} and the question: {query} as per the internal knowledge base",
)
print("\nFinal Answer:")
print(response.output_text)
Final Answer:
The presentation of confusion, ataxia, and ophthalmoplegia in a 45-year-old man with a history of alcohol use is suggestive of Wernicke's encephalopathy. This condition is caused by thiamine (vitamin B1) deficiency, often associated with chronic alcohol use.

The recommended treatment is the immediate administration of thiamine, typically given intravenously or intramuscularly, to prevent progression to more severe neurological damage or Korsakoff syndrome.
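
The prompt above interpolates the context and the question into a single sentence. One alternative, shown here purely as a sketch, is to assemble the prompt from clearly separated sections so that retrieved passages and the user question cannot blur together. `build_context` and `build_prompt` are hypothetical helpers, and the `matches` list uses plain dictionaries as stand-ins for Pinecone match objects.

```python
def build_context(matches):
    """Format retrieved matches as numbered Question/Answer blocks."""
    blocks = []
    for rank, match in enumerate(matches, start=1):
        meta = match["metadata"]
        blocks.append(
            f"[{rank}] Question: {meta.get('Question', '')}\n"
            f"Answer: {meta.get('Answer', '')}"
        )
    return "\n\n".join(blocks)


def build_prompt(context, question):
    """Combine the retrieved context and the user question into one prompt."""
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Stand-ins for the 'matches' returned by the index query.
matches = [
    {"metadata": {"Question": "Q1?", "Answer": "A1."}},
    {"metadata": {"Question": "Q2?", "Answer": "A2."}},
]

context = build_context(matches)
prompt = build_prompt(context, "What now?")
print(prompt)
```

The resulting string can be passed as the `input` argument of `client.responses.create` in the same way as the f-string above.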

Orchestrate Multi-Tool Calls

Now we define the tools made available to the model through the Responses API, including a function that invokes an external vector store, Pinecone in this example.

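Web Search Preview Tool: Enables the model to perform live web searches and preview the results. This is ideal for retrieving real-time or up-to-date information from the internet.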

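Pinecone Search Tool: Allows the model to query a vector database using semantic search. This is especially useful for retrieving relevant documents, such as medical literature or other domain-specific content, that have been stored in a vectorized format.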

# Tools definition: The list of tools includes:
# - A web search preview tool.
# - A Pinecone search tool for retrieving medical documents.

# Define available tools.
tools = [   
    {"type": "web_search_preview",
      "user_location": {
        "type": "approximate",
        "country": "US",
        "region": "California",
        "city": "SF"
      },
      "search_context_size": "medium"},
    {
        "type": "function",
        "name": "PineconeSearchDocuments",
        "description": "Search for relevant documents based on the medical question asked by the user that is stored within the vector database using a semantic query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The natural language query to search the vector database."
                },
                "top_k": {
                    "type": "integer",
                    "description": "Number of top results to return.",
                    "default": 3
                }
            },
            "required": ["query"],
            "additionalProperties": False
        }
    }
]
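
When the model calls a function tool, its arguments arrive as a JSON string that the calling code has to decode. The sketch below shows one way to check such a string against the `parameters` schema defined above. `parse_arguments` is a hypothetical helper, and the sketch assumes the caller is responsible for applying the schema's `default` for `top_k` rather than relying on the API to fill it in.

```python
import json

# Copy of the "parameters" schema from the PineconeSearchDocuments tool above.
SEARCH_PARAMETERS = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "top_k": {"type": "integer", "default": 3},
    },
    "required": ["query"],
    "additionalProperties": False,
}


def parse_arguments(schema, arguments_json):
    """Decode tool-call arguments, check required/unknown keys, apply defaults."""
    args = json.loads(arguments_json)
    properties = schema["properties"]

    missing = [name for name in schema.get("required", []) if name not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")

    if schema.get("additionalProperties") is False:
        unknown = [name for name in args if name not in properties]
        if unknown:
            raise ValueError(f"unexpected arguments: {unknown}")

    # Assumption: defaults are applied client-side.
    for name, spec in properties.items():
        if name not in args and "default" in spec:
            args[name] = spec["default"]
    return args


parsed = parse_arguments(SEARCH_PARAMETERS, '{"query": "sickle cell hip pain"}')
print(parsed)  # {'query': 'sickle cell hip pain', 'top_k': 3}
```

Validating before running the search keeps a malformed call from turning into a confusing error inside the retrieval code.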
# Example queries that the model should route appropriately.
queries = [
    {"query": "Who won the cricket world cup in 1983?"},
    {"query": "What is the most common cause of death in the United States according to the internet?"},
    {"query": ("A 7-year-old boy with sickle cell disease is experiencing knee and hip pain, "
               "has been admitted for pain crises in the past, and now walks with a limp. "
               "His exam shows a normal, cool hip with decreased range of motion and pain with ambulation. "
               "What is the most appropriate next step in management according to the internal knowledge base?")}
]
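
The two tools differ in where they execute: the web search runs on OpenAI's side and appears in the response as a `web_search_call` item next to the final message, while a function tool such as the Pinecone search appears as a `function_call` item that your own code must execute and answer. The sketch below illustrates the client-side half of that routing for any number of function calls in one response. `build_follow_up_items` and `fake_pinecone_search` are hypothetical names, and `SimpleNamespace` objects stand in for the SDK's response items so the example runs offline.

```python
import json
from types import SimpleNamespace


def build_follow_up_items(output_items, handlers):
    """Run every function call and return the items to append to the conversation."""
    follow_up = []
    for item in output_items:
        if item.type != "function_call":
            # "message" and hosted "web_search_call" items need no client-side work.
            continue
        handler = handlers[item.name]
        result = handler(**json.loads(item.arguments))
        follow_up.append(item)  # echo the model's call back into the conversation
        follow_up.append({
            "type": "function_call_output",
            "call_id": item.call_id,
            "output": str(result),
        })
    return follow_up


def fake_pinecone_search(query, top_k=3):
    """Stand-in for a real vector search."""
    return f"stub results for {query!r} (top_k={top_k})"


# Stand-ins for items in response.output.
output_items = [
    SimpleNamespace(type="web_search_call"),
    SimpleNamespace(
        type="function_call",
        name="PineconeSearchDocuments",
        arguments='{"query": "sickle cell hip pain"}',
        call_id="call_123",
    ),
]

follow_up = build_follow_up_items(
    output_items, {"PineconeSearchDocuments": fake_pinecone_search}
)
print(follow_up[1])
```

Keying the handlers by tool name means a new function tool only needs a schema entry and a matching callable.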
# Process each query dynamically.
for item in queries:
    input_messages = [{"role": "user", "content": item["query"]}]
    print("\n🌟--- Processing Query ---🌟")
    print(f"🔍 **User Query:** {item['query']}")
    
    # Call the Responses API with tools enabled and allow parallel tool calls.
    response = client.responses.create(
        model="gpt-4o",
        input=[
            {"role": "system", "content": "When prompted with a question, select the right tool to use based on the question."
            },
            {"role": "user", "content": item["query"]}
        ],
        tools=tools,
        parallel_tool_calls=True
    )
    
    print("\n✨ **Initial Response Output:**")
    print(response.output)
    
    # Determine if a function call is needed and process accordingly.
    # The hosted web search runs server-side: its "web_search_call" item arrives
    # together with the final message, so only "function_call" items are handled here.
    if response.output:
        tool_call = response.output[0]
        if tool_call.type == "function_call":
            tool_name = tool_call.name
            print(f"\n🔧 **Model triggered a tool call:** {tool_name}")

            if tool_name == "PineconeSearchDocuments":
                print("🔍 **Invoking PineconeSearchDocuments tool...**")
                res = query_pinecone_index(client, index, MODEL, item["query"])
                if res["matches"]:
                    best_match = res["matches"][0]["metadata"]
                    result = f"**Question:** {best_match.get('Question', 'N/A')}\n**Answer:** {best_match.get('Answer', 'N/A')}"
                else:
                    result = "**No matching documents found in the index.**"
                print("✅ **PineconeSearchDocuments tool invoked successfully.**")
            else:
                # Fallback for a function name that has no local implementation.
                result = f"**No handler is defined for tool: {tool_name}**"
            
            # Append the tool call and its output back into the conversation.
            input_messages.append(tool_call)
            input_messages.append({
                "type": "function_call_output",
                "call_id": tool_call.call_id,
                "output": str(result)
            })
            
            # Get the final answer incorporating the tool's result.
            final_response = client.responses.create(
                model="gpt-4o",
                input=input_messages,
                tools=tools,
                parallel_tool_calls=True
            )
            print("\n💡 **Final Answer:**")
            print(final_response.output_text)
        else:
            # If no tool call is triggered, print the response directly.
            print("💡 **Final Answer:**")
            print(response.output_text)
🌟--- Processing Query ---🌟
🔍 **User Query:** Who won the cricket world cup in 1983?

✨ **Initial Response Output:**
[ResponseOutputMessage(id='msg_67e6e7a9f7508191a9d18c3ff25310290811a0720cf47168', content=[ResponseOutputText(annotations=[], text='India won the Cricket World Cup in 1983.', type='output_text')], role='assistant', status='completed', type='message')]
💡 **Final Answer:**
India won the Cricket World Cup in 1983.

🌟--- Processing Query ---🌟
🔍 **User Query:** What is the most common cause of death in the United States according to the internet?

✨ **Initial Response Output:**
[ResponseFunctionWebSearch(id='ws_67e6e7aad0248191ab974d4b09b460c90537f90023d2dd32', status='completed', type='web_search_call'), ResponseOutputMessage(id='msg_67e6e7ace08081918f06b5cac32e8c0e0537f90023d2dd32', content=[ResponseOutputText(annotations=[AnnotationURLCitation(end_index=363, start_index=225, title='10 Leading Causes of Death in the U.S.', type='url_citation', url='https://www.usnews.com/news/healthiest-communities/slideshows/top-10-causes-of-death-in-america?slide=11&utm_source=openai'), AnnotationURLCitation(end_index=753, start_index=625, title='Top causes of death in the US — see the CDC’s latest list - Rifnote', type='url_citation', url='https://rifnote.com/health/2024/08/11/top-causes-of-death-in-the-us-see-the-cdcs-latest-list/?utm_source=openai'), AnnotationURLCitation(end_index=1014, start_index=886, title='Top causes of death in the US — see the CDC’s latest list - Rifnote', type='url_citation', url='https://rifnote.com/health/2024/08/11/top-causes-of-death-in-the-us-see-the-cdcs-latest-list/?utm_source=openai'), AnnotationURLCitation(end_index=1216, start_index=1061, title='US deaths are down and life expectancy is up, but improvements are slowing', type='url_citation', url='https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai'), AnnotationURLCitation(end_index=1394, start_index=1219, title='A Mysterious Health Wave Is Breaking Out Across the U.S.', type='url_citation', url='https://www.theatlantic.com/ideas/archive/2024/12/violence-obesity-overdoses-health-covid/681079/?utm_source=openai')], text='According to the Centers for Disease Control and Prevention (CDC), heart disease was the leading cause of death in the United States in 2023, accounting for 680,980 deaths, which is approximately 22% of all deaths that year. ([usnews.com](https://www.usnews.com/news/healthiest-communities/slideshows/top-10-causes-of-death-in-america?slide=11&utm_source=openai))\n\nThe top 10 causes of death in the U.S. for 2023 were:\n\n1. 
Heart disease\n2. Cancer\n3. Unintentional injury\n4. Stroke\n5. Chronic lower respiratory diseases\n6. Alzheimer’s disease\n7. Diabetes\n8. Kidney disease\n9. Chronic liver disease and cirrhosis\n10. COVID-19\n\n([rifnote.com](https://rifnote.com/health/2024/08/11/top-causes-of-death-in-the-us-see-the-cdcs-latest-list/?utm_source=openai))\n\nNotably, COVID-19, which was the fourth leading cause of death in 2022, dropped to the tenth position in 2023, with 76,446 deaths. ([rifnote.com](https://rifnote.com/health/2024/08/11/top-causes-of-death-in-the-us-see-the-cdcs-latest-list/?utm_source=openai))\n\n\n## Recent Trends in U.S. Mortality Rates:\n- [US deaths are down and life expectancy is up, but improvements are slowing](https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai)\n- [A Mysterious Health Wave Is Breaking Out Across the U.S.](https://www.theatlantic.com/ideas/archive/2024/12/violence-obesity-overdoses-health-covid/681079/?utm_source=openai) ', type='output_text')], role='assistant', status='completed', type='message')]
💡 **Final Answer:**
According to the Centers for Disease Control and Prevention (CDC), heart disease was the leading cause of death in the United States in 2023, accounting for 680,980 deaths, which is approximately 22% of all deaths that year. ([usnews.com](https://www.usnews.com/news/healthiest-communities/slideshows/top-10-causes-of-death-in-america?slide=11&utm_source=openai))

The top 10 causes of death in the U.S. for 2023 were:

1. Heart disease
2. Cancer
3. Unintentional injury
4. Stroke
5. Chronic lower respiratory diseases
6. Alzheimer’s disease
7. Diabetes
8. Kidney disease
9. Chronic liver disease and cirrhosis
10. COVID-19

([rifnote.com](https://rifnote.com/health/2024/08/11/top-causes-of-death-in-the-us-see-the-cdcs-latest-list/?utm_source=openai))

Notably, COVID-19, which was the fourth leading cause of death in 2022, dropped to the tenth position in 2023, with 76,446 deaths. ([rifnote.com](https://rifnote.com/health/2024/08/11/top-causes-of-death-in-the-us-see-the-cdcs-latest-list/?utm_source=openai))


## Recent Trends in U.S. Mortality Rates:
- [US deaths are down and life expectancy is up, but improvements are slowing](https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai)
- [A Mysterious Health Wave Is Breaking Out Across the U.S.](https://www.theatlantic.com/ideas/archive/2024/12/violence-obesity-overdoses-health-covid/681079/?utm_source=openai) 

🌟--- Processing Query ---🌟
🔍 **User Query:** A 7-year-old boy with sickle cell disease is experiencing knee and hip pain, has been admitted for pain crises in the past, and now walks with a limp. His exam shows a normal, cool hip with decreased range of motion and pain with ambulation. What is the most appropriate next step in management according to the internal knowledge base?

✨ **Initial Response Output:**
[ResponseFunctionToolCall(arguments='{"query":"7-year-old sickle cell disease knee hip pain limp normal cool hip decreased range of motion"}', call_id='call_ds0ETZbYtX71U2bQZXTBEWxN', name='PineconeSearchDocuments', type='function_call', id='fc_67e6e7b03ee48191bb400c13c359c35e0aeeec60d0806312', status='completed')]

🔧 **Model triggered a tool call:** PineconeSearchDocuments
🔍 **Invoking PineconeSearchDocuments tool...**
Query Results:
0.87: A 7-year-old boy with sickle cell disease is experiencing knee and hip pain, has been admitted for pain crises in the past, and now walks with a limp. His physical exam shows a normal and cool hip to the touch, with decreased range of motion at the hip and pain with ambulation. Given these findings, what is the most appropriate next step in the management of this patient's hip pain? - In managing the hip pain of a 7-year-old boy with sickle cell disease, who presents with knee and hip pain, a limp, and decreased range of motion in the hip, the most appropriate next step is to obtain an X-ray of the hip. This will help evaluate the possibility of avascular necrosis (AVN) or other structural abnormalities. X-rays are typically the first-line imaging technique in such cases due to their accessibility and ability to reveal gross pathological changes. If the X-ray does not provide conclusive information and clinical suspicion of AVN remains high, an MRI may subsequently be considered for a more detailed assessment.
0.55: What is the most likely diagnosis for a 2-year-old 70 kg child who presents with limitation of abduction and internal rotation, tenderness in Scarpa's triangle, and abduction of the limb upon flexing the hip? - Based on the described symptoms and the unusual weight for a 2-year-old child, the most likely diagnosis is Slipped Capital Femoral Epiphysis (SCFE). Even though SCFE typically occurs in older children, mainly adolescents, the combination of excessive weight, limited hip abduction and internal rotation, tenderness in the hip area, and the characteristic limb movement (abduction upon hip flexion) strongly points towards SCFE as the most plausible diagnosis in this scenario.
0.48: A 5-year-old boy has recurrent abdominal pain primarily occurring during school hours, with no significant findings on physical examination and normal stool characteristics. His symptoms resolve at home, and his laboratory tests and abdominal exam are unremarkable. Considering the psychological factors involved, what is the most appropriate next step in managing his condition? - Given the symptoms and the context you've provided, it seems quite possible that the boy's recurrent abdominal pain is linked to psychological stressors related to school. Since all medical tests and examinations have returned normal results, this suggests that the pain might be stress-induced, possibly due to anxiety or stress at school. 

The most appropriate next step is to focus on addressing any potential psychological or emotional factors. Consulting a psychologist or school counselor would be beneficial. They can work with the boy to explore any underlying emotional issues or anxieties about school. Through conversation, play, or other therapeutic techniques suitable for his age, they can help identify and manage any stressors he might be facing. This approach could not only help alleviate his abdominal pain but also improve his overall well-being by addressing the source of his anxiety.
0.44: In a patient who, five days post-open colectomy for colon cancer, develops severe pain and swelling of the left calf along with necrotic lesions, a fever, and thrombocytopenia while on unfractionated heparin, what is the most appropriate next step in management? - In this clinical scenario, the presentation of severe pain and swelling in the calf, necrotic skin lesions, fever, and thrombocytopenia in a patient receiving unfractionated heparin strongly suggests heparin-induced thrombocytopenia (HIT). HIT is a prothrombotic disorder caused by antibodies against heparin-platelet factor 4 complexes, leading to platelet activation, thrombocytopenia, and an increased risk of thrombosis.

The most appropriate next step in management is to immediately discontinue the unfractionated heparin to prevent further complications related to thrombosis. Simultaneously, it's crucial to initiate an alternative anticoagulant that does not cross-react with HIT antibodies to manage the thrombotic risk. Argatroban or fondaparinux are commonly used anticoagulants in this context as they are safe and effective for patients with HIT. Direct-acting oral anticoagulants (DOACs) are also potential options, but argatroban is often preferred initially due to its intravenous route and ability to be titrated easily in acute care settings. This dual approach addresses both the cause and the risk effectively.
0.44: In a patient with sickle cell anaemia presenting with multiple non-suppurative osteomyelitic dactylitis, what is the most likely causative organism? - In a patient with sickle cell anemia presenting with multiple non-suppurative osteomyelitic dactylitis, the most likely causative organism is Salmonella species. In individuals with sickle cell disease, Salmonella is particularly notorious for causing osteomyelitis. The relationship between sickle cell anemia and Salmonella infections, especially in the bone, is well-documented, and their presentations can often be less typical and less suppurative than those caused by other common bacteria like Staphylococcus aureus.
✅ **PineconeSearchDocuments tool invoked successfully.**

💡 **Final Answer:**
The most appropriate next step in the management of this 7-year-old boy with sickle cell disease and hip pain is to obtain an X-ray of the hip. This will help evaluate for potential avascular necrosis or other structural issues. If the X-ray is inconclusive and there is still a high suspicion of avascular necrosis, further imaging with an MRI may be considered.

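As shown above, the appropriate tool is invoked for each query to determine the best response.

For example, in the third example, when the model triggers the tool named `PineconeSearchDocuments`, the code calls `query_pinecone_index` with the current query and extracts the best match (or other appropriate context) as the result. For queries that are not health-related, or that explicitly ask for an internet search, the model uses the built-in web search tool, which appears in the output as a `web_search_call` item. For other queries, it may choose not to call any tool and instead answer directly from the question.

Finally, the tool call and its output are appended to the conversation, and the Responses API generates the final answer.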
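The dispatch-and-append step described above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: `query_pinecone_index_stub` and `handle_output_items` are hypothetical names, the stub returns placeholder text instead of querying Pinecone, and `SimpleNamespace` objects stand in for the SDK's response output items so the example runs without network access.

```python
import json
from types import SimpleNamespace


def query_pinecone_index_stub(query, top_k=3):
    """Hypothetical stand-in for the notebook's Pinecone query helper."""
    return f"[stub context for: {query} (top_k={top_k})]"


def handle_output_items(output_items, input_messages):
    """Run each function call and append the call plus its output to the conversation.

    Returns True when at least one function call was handled, which means a
    second Responses API call is needed to produce the final answer.
    """
    handled = False
    for item in output_items:
        if item.type != "function_call":
            # web_search_call and message items are produced by the API itself.
            continue
        if item.name == "PineconeSearchDocuments":
            args = json.loads(item.arguments)
            result = query_pinecone_index_stub(args["query"], args.get("top_k", 3))
            # Append the original call, then its output keyed by the same call_id.
            input_messages.append(item)
            input_messages.append({
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": str(result),
            })
            handled = True
    return handled


# Usage with fake output items that mimic the shapes printed above.
fake_output = [
    SimpleNamespace(type="web_search_call", id="ws_demo"),
    SimpleNamespace(
        type="function_call",
        name="PineconeSearchDocuments",
        call_id="call_demo",
        arguments='{"query": "hip pain in sickle cell disease", "top_k": 3}',
    ),
]
messages = [{"role": "user", "content": "demo question"}]
needs_followup = handle_output_items(fake_output, messages)
```

When `needs_followup` is true, passing `messages` as `input` to a second `client.responses.create` call yields the final answer.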

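Multi-tool orchestration flow

Now let's modify the input query and the system instructions passed to the Responses API so that the model follows a fixed sequence of tool calls before generating its output.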

# Process one query as an example to understand the tool calls and function calls as part of the response output
item = "What is the most common cause of death in the United States"

# Initialize input messages with the user's query.
input_messages = [{"role": "user", "content": item}]
print("\n🌟--- Processing Query ---🌟")
print(f"🔍 **User Query:** {item}")

# Call the Responses API with tools enabled and allow parallel tool calls.
print("\n🔧 **Calling Responses API with Tools Enabled**")
print("\n🕵️‍♂️ **Step 1: Web Search Call**")
print("   - Initiating web search to gather initial information.")
print("\n📚 **Step 2: Pinecone Search Call**")
print("   - Querying Pinecone to find relevant examples from the internal knowledge base.")

response = client.responses.create(
    model="gpt-4o",
    input=[
        {"role": "system", "content": "Every time it's prompted with a question, first call the web search tool for results, then call `PineconeSearchDocuments` to find real examples in the internal knowledge base."},
        {"role": "user", "content": item},
    ],
    tools=tools,
    parallel_tool_calls=True,
)

# Print the initial response output.
print("input_messages", input_messages)

print("\n✨ **Initial Response Output:**")
print(response.output)
🌟--- Processing Query ---🌟
🔍 **User Query:** What is the most common cause of death in the United States

🔧 **Calling Responses API with Tools Enabled**

🕵️‍♂️ **Step 1: Web Search Call**
   - Initiating web search to gather initial information.

📚 **Step 2: Pinecone Search Call**
   - Querying Pinecone to find relevant examples from the internal knowledge base.
input_messages [{'role': 'user', 'content': 'What is the most common cause of death in the United States'}]

✨ **Initial Response Output:**
[ResponseFunctionWebSearch(id='ws_67e6e83241ac81918f93ffc96491ec390fdddafaeefcefc1', status='completed', type='web_search_call'), ResponseOutputMessage(id='msg_67e6e833a2cc8191a9df22f324a876b00fdddafaeefcefc1', content=[ResponseOutputText(annotations=[AnnotationURLCitation(end_index=698, start_index=613, title='Products - Data Briefs - Number 521 - December 2024', type='url_citation', url='https://www.cdc.gov/nchs/products/databriefs/db521.htm?utm_source=openai'), AnnotationURLCitation(end_index=984, start_index=891, title='US deaths are down and life expectancy is up, but improvements are slowing', type='url_citation', url='https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai'), AnnotationURLCitation(end_index=1186, start_index=1031, title='US deaths are down and life expectancy is up, but improvements are slowing', type='url_citation', url='https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai')], text="As of 2023, the leading causes of death in the United States are:\n\n1. **Heart Disease**: 680,981 deaths\n2. **Cancer**: 613,352 deaths\n3. **Unintentional Injuries**: 222,698 deaths\n4. **Stroke**: 162,639 deaths\n5. **Chronic Lower Respiratory Diseases**: 145,357 deaths\n6. **Alzheimer's Disease**: 114,034 deaths\n7. **Diabetes**: 95,190 deaths\n8. **Kidney Disease**: 55,253 deaths\n9. **Chronic Liver Disease and Cirrhosis**: 52,222 deaths\n10. **COVID-19**: 49,932 deaths\n\nNotably, COVID-19 has dropped from the fourth leading cause in 2022 to the tenth in 2023, reflecting a significant decrease in related deaths. ([cdc.gov](https://www.cdc.gov/nchs/products/databriefs/db521.htm?utm_source=openai))\n\nOverall, the U.S. experienced a decline in total deaths and a modest increase in life expectancy in 2023, attributed to reductions in deaths from COVID-19, heart disease, and drug overdoses. ([apnews.com](https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai))\n\n\n## Recent Trends in U.S. 
Mortality Rates:\n- [US deaths are down and life expectancy is up, but improvements are slowing](https://apnews.com/article/be061f9f14c883178eea6dddc9550e60?utm_source=openai) ", type='output_text')], role='assistant', status='completed', type='message'), ResponseFunctionToolCall(arguments='{"query":"most common cause of death in the United States","top_k":3}', call_id='call_6YWhEw3QSI7wGZBlNs5Pz4zI', name='PineconeSearchDocuments', type='function_call', id='fc_67e6e8364e4c819198501fba5d3f155b0fdddafaeefcefc1', status='completed')]
# Understand the tool calls and function calls as part of the response output

import pandas as pd

# Create a list to store the tool call and function call details
tool_calls = []

# Iterate through the response output and collect the details
for i in response.output:
    tool_calls.append({
        "Type": i.type,
        "Call ID": i.call_id if hasattr(i, 'call_id') else i.id if hasattr(i, 'id') else "N/A",
        "Output": str(i.output) if hasattr(i, 'output') else "N/A",
        "Name": i.name if hasattr(i, 'name') else "N/A"
    })

# Convert the list to a DataFrame for tabular display
df_tool_calls = pd.DataFrame(tool_calls)

# Display the DataFrame
df_tool_calls
   Type             Call ID                                             Output  Name
0  web_search_call  ws_67e6e83241ac81918f93ffc96491ec390fdddafaeef...  N/A     N/A
1  message          msg_67e6e833a2cc8191a9df22f324a876b00fdddafaee...  N/A     N/A
2  function_call    call_6YWhEw3QSI7wGZBlNs5Pz4zI                       N/A     PineconeSearchDocuments
tool_call_1 = response.output[0]
print(tool_call_1)
print(tool_call_1.id)

tool_call_2 = response.output[2]
print(tool_call_2)
print(tool_call_2.call_id)
ResponseFunctionWebSearch(id='ws_67e6e83241ac81918f93ffc96491ec390fdddafaeefcefc1', status='completed', type='web_search_call')
ws_67e6e83241ac81918f93ffc96491ec390fdddafaeefcefc1
ResponseFunctionToolCall(arguments='{"query":"most common cause of death in the United States","top_k":3}', call_id='call_6YWhEw3QSI7wGZBlNs5Pz4zI', name='PineconeSearchDocuments', type='function_call', id='fc_67e6e8364e4c819198501fba5d3f155b0fdddafaeefcefc1', status='completed')
call_6YWhEw3QSI7wGZBlNs5Pz4zI
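# Append the tool call and its output back into the conversation.
# Note: `result` is not computed in this cell; it must already hold the
# Pinecone retrieval output (for example, from the earlier query loop).
input_messages.append(response.output[2])
input_messages.append({
    "type": "function_call_output",
    "call_id": tool_call_2.call_id,
    "output": str(result)
})
print(input_messages)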
[{'role': 'user', 'content': 'What is the most common cause of death in the United States'}, ResponseFunctionToolCall(arguments='{"query":"most common cause of death in the United States"}', call_id='call_8Vzsn4RwMOgXyX98UpZY8hls', name='PineconeSearchDocuments', type='function_call', id='fc_67e348f36f7c81919d0aeef1855df3f20d0bd7f2a5744b88', status='completed')]
[{'role': 'user', 'content': 'What is the most common cause of death in the United States'}, ResponseFunctionToolCall(arguments='{"query":"most common cause of death in the United States"}', call_id='call_8Vzsn4RwMOgXyX98UpZY8hls', name='PineconeSearchDocuments', type='function_call', id='fc_67e348f36f7c81919d0aeef1855df3f20d0bd7f2a5744b88', status='completed'), {'type': 'function_call_output', 'call_id': 'call_8Vzsn4RwMOgXyX98UpZY8hls', 'output': "**Question:** A 7-year-old boy with sickle cell disease is experiencing knee and hip pain, has been admitted for pain crises in the past, and now walks with a limp. His physical exam shows a normal and cool hip to the touch, with decreased range of motion at the hip and pain with ambulation. Given these findings, what is the most appropriate next step in the management of this patient's hip pain?\n**Answer:** In managing the hip pain of a 7-year-old boy with sickle cell disease, who presents with knee and hip pain, a limp, and decreased range of motion in the hip, the most appropriate next step is to obtain an X-ray of the hip. This will help evaluate the possibility of avascular necrosis (AVN) or other structural abnormalities. X-rays are typically the first-line imaging technique in such cases due to their accessibility and ability to reveal gross pathological changes. If the X-ray does not provide conclusive information and clinical suspicion of AVN remains high, an MRI may subsequently be considered for a more detailed assessment."}]

# Get the final answer incorporating the tool's result.
print("\n🔧 **Calling Responses API for Final Answer**")

response_2 = client.responses.create(
    model="gpt-4o",
    input=input_messages,
)
print(response_2)
🔧 **Calling Responses API for Final Answer**
Response(id='resp_67e6e886ac7081918b07224fb1ed38ab05c4a598f9697c7c', created_at=1743186054.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4o-2024-08-06', object='response', output=[ResponseOutputMessage(id='msg_67e6e8872ddc81918e92c9e4508abbe005c4a598f9697c7c', content=[ResponseOutputText(annotations=[], text='The most common cause of death in the United States is heart disease.', type='output_text')], role='assistant', status='completed', type='message')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[], top_p=1.0, max_output_tokens=None, previous_response_id=None, reasoning=Reasoning(effort=None, generate_summary=None), status='completed', text=ResponseTextConfig(format=ResponseFormatText(type='text')), truncation='disabled', usage=ResponseUsage(input_tokens=37, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=15, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=52), user=None, store=False)
# print the final answer
print(response_2.output_text)
The most common cause of death in the United States is heart disease.

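In this cookbook, we have seen how to use OpenAI's Responses API to implement a retrieval-augmented generation (RAG) approach with multi-tool calling. The example shows the model choosing the appropriate tool for each input query: general questions can be handled by built-in tools such as web search, while specific medical queries tied to internal knowledge are resolved by retrieving context from a vector database (such as Pinecone) through function calling. We also showed how multiple tool calls can be combined in sequence to produce a final response, following the instructions we give to the Responses API.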
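The walkthrough above resolves a single function call by hand (`response.output[2]`). As a generalization, the following sketch loops until the model returns no pending function calls. It is an assumption-laden illustration rather than the notebook's method: `run_until_final`, `fake_create_response`, and `fake_run_tool` are hypothetical names, and the fake functions replace the real API and Pinecone so the example runs offline.

```python
import json
from types import SimpleNamespace


def run_until_final(create_response, input_messages, run_tool, max_rounds=3):
    """Call the model repeatedly, resolving function calls between rounds."""
    for _ in range(max_rounds):
        response = create_response(input_messages)
        calls = [o for o in response.output if o.type == "function_call"]
        if not calls:
            # No pending function calls: the response carries the final text.
            return response.output_text
        for call in calls:
            result = run_tool(call.name, json.loads(call.arguments))
            input_messages.append(call)
            input_messages.append({
                "type": "function_call_output",
                "call_id": call.call_id,
                "output": str(result),
            })
    raise RuntimeError("no final answer within max_rounds")


def fake_create_response(messages):
    """Hypothetical stand-in for client.responses.create (no network access)."""
    has_tool_output = any(
        isinstance(m, dict) and m.get("type") == "function_call_output"
        for m in messages
    )
    if has_tool_output:
        return SimpleNamespace(
            output=[SimpleNamespace(type="message")],
            output_text="final answer (stub)",
        )
    call = SimpleNamespace(
        type="function_call",
        name="PineconeSearchDocuments",
        call_id="call_demo",
        arguments='{"query": "demo"}',
    )
    return SimpleNamespace(output=[call], output_text="")


def fake_run_tool(name, args):
    """Hypothetical tool runner; a real one would query Pinecone."""
    return f"stub result from {name} for {args['query']}"


messages = [{"role": "user", "content": "demo question"}]
answer = run_until_final(fake_create_response, messages, fake_run_tool)
print(answer)
```

With the real client, `create_response` could be a small wrapper such as `lambda msgs: client.responses.create(model="gpt-4o", input=msgs, tools=tools)`. The `max_rounds` cap is a design choice that guards against a model that keeps requesting tools indefinitely.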

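As you continue to experiment and build on these concepts, consider exploring additional resources and examples to deepen your understanding and extend your applications.

Happy coding!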