Assistants API 概述 (Python SDK)

2023 年 11 月 10 日
在 Github 中打开

新的 Assistants API 是我们 Chat Completions API 的有状态演变,旨在简化类似助手体验的创建,并使开发者能够访问强大的工具,如代码解释器和文件搜索。

Assistants API Diagram

Chat Completions API 与 Assistants API

Chat Completions API 的原语是 Messages,您可以使用 Model (gpt-4o, gpt-4o-mini 等) 对其执行 Completion。它轻量且强大,但本质上是无状态的,这意味着您必须手动管理对话状态、工具定义、检索文档和代码执行。

Assistants API 的原语是

  • Assistants,它封装了基础模型、指令、工具和(上下文)文档,
  • Threads,它代表对话的状态,以及
  • Runs,它驱动 AssistantThread 上的执行,包括文本响应和多步工具使用。

我们将看看如何使用这些原语来创建强大的有状态体验。

设置

Python SDK

注意 我们已经更新了我们的 Python SDK 以添加对 Assistants API 的支持,因此您需要将其更新到最新版本(撰写本文时为 1.59.4)。

!pip install --upgrade openai
Requirement already satisfied: openai in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (1.59.4)
Requirement already satisfied: anyio<5,>=3.5.0 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (3.7.1)
Requirement already satisfied: distro<2,>=1.7.0 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (1.9.0)
Requirement already satisfied: httpx<1,>=0.23.0 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (0.27.0)
Requirement already satisfied: jiter<1,>=0.4.0 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (0.7.0)
Requirement already satisfied: pydantic<3,>=1.9.0 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (2.8.2)
Requirement already satisfied: sniffio in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (1.3.1)
Requirement already satisfied: tqdm>4 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (4.66.4)
Requirement already satisfied: typing-extensions<5,>=4.11 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from openai) (4.12.2)
Requirement already satisfied: idna>=2.8 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from anyio<5,>=3.5.0->openai) (3.7)
Requirement already satisfied: certifi in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from httpx<1,>=0.23.0->openai) (2024.7.4)
Requirement already satisfied: httpcore==1.* in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from httpx<1,>=0.23.0->openai) (1.0.5)
Requirement already satisfied: h11<0.15,>=0.13 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->openai) (0.14.0)
Requirement already satisfied: annotated-types>=0.4.0 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from pydantic<3,>=1.9.0->openai) (0.7.0)
Requirement already satisfied: pydantic-core==2.20.1 in /Users/lee.spacagna/myenv/lib/python3.12/site-packages (from pydantic<3,>=1.9.0->openai) (2.20.1)

并运行以下命令确保它是最新的

!pip show openai | grep Version
Version: 1.59.4
import json

def show_json(obj):
    display(json.loads(obj.model_dump_json()))

Assistants Playground

让我们从创建一个助手开始!我们将创建一个数学辅导助手,就像我们的 文档 中一样。

Creating New Assistant

您也可以直接通过 Assistants API 创建助手,如下所示

from openai import OpenAI
import os

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as env var>"))


assistant = client.beta.assistants.create(
    name="Math Tutor",
    instructions="You are a personal math tutor. Answer questions briefly, in a sentence or less.",
    model="gpt-4o",
)
show_json(assistant)
{'id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'created_at': 1736340398,
 'description': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'metadata': {},
 'model': 'gpt-4o',
 'name': 'Math Tutor',
 'object': 'assistant',
 'tools': [],
 'response_format': 'auto',
 'temperature': 1.0,
 'tool_resources': {'code_interpreter': None, 'file_search': None},
 'top_p': 1.0} 'tools': [],
 'response_format': 'auto',
 'temperature': 1.0,
 'tool_resources': {'code_interpreter': None, 'file_search': None},
 'top_p': 1.0}

无论您是通过仪表板还是 API 创建助手,您都需要跟踪助手 ID。这是您在 Threads 和 Runs 中引用助手的方式。

接下来,我们将创建一个新的 Thread 并向其中添加一条 Message。这将保存我们对话的状态,因此我们不必每次都重新发送整个消息历史记录。

创建一个新线程

thread = client.beta.threads.create()
show_json(thread)
{'id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6',
 'created_at': 1736340398,
 'metadata': {},
 'object': 'thread',
 'tool_resources': {'code_interpreter': None, 'file_search': None}} 'object': 'thread',
 'tool_resources': {'code_interpreter': None, 'file_search': None}}

然后将消息添加到线程

message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="I need to solve the equation `3x + 11 = 14`. Can you help me?",
)
show_json(message)
{'id': 'msg_1q4Y7ZZ9gIcPoAKSx9UtrrKJ',
 'assistant_id': None,
 'attachments': [],
 'completed_at': None,
 'attachments': [],
 'completed_at': None,
 'content': [{'text': {'annotations': [],
    'value': 'I need to solve the equation `3x + 11 = 14`. Can you help me?'},
   'type': 'text'}],
 'created_at': 1736340400,
 'incomplete_at': None,
 'incomplete_details': None,
 'metadata': {},
 'object': 'thread.message',
 'role': 'user',
 'run_id': None,
 'status': None,
 'thread_id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6'}

注意 即使您不再每次都发送整个历史记录,您仍然需要为每次 Run 的整个对话历史记录的 tokens 付费。

Runs

请注意,我们创建的 Thread 与我们之前创建的助手没有关联!Threads 独立于 Assistants 存在,这可能与您使用 ChatGPT 的期望不同(在 ChatGPT 中,线程与模型/GPT 绑定)。

要从助手为给定 Thread 获取补全,我们必须创建一个 Run。创建 Run 将指示助手应该查看 Thread 中的消息并采取行动:通过添加单个响应或使用工具。

注意 Runs 是 Assistants API 和 Chat Completions API 之间的关键区别。虽然在 Chat Completions 中,模型只会回复一条消息,但在 Assistants API 中,Run 可能会导致助手使用一个或多个工具,并可能向 Thread 添加多条消息。

为了让我们的助手响应用户,让我们创建 Run。如前所述,您必须同时指定助手和 Thread。

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
show_json(run)
{'id': 'run_qVYsWok6OCjHxkajpIrdHuVP',
 'assistant_id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'cancelled_at': None,
 'completed_at': None,
 'created_at': 1736340403,
 'expires_at': 1736341003,
 'failed_at': None,
 'incomplete_details': None,
 'incomplete_details': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'last_error': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'metadata': {},
 'model': 'gpt-4o',
 'object': 'thread.run',
 'parallel_tool_calls': True,
 'parallel_tool_calls': True,
 'required_action': None,
 'response_format': 'auto',
 'response_format': 'auto',
 'started_at': None,
 'status': 'queued',
 'thread_id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6',
 'tool_choice': 'auto',
 'tools': [],
 'truncation_strategy': {'type': 'auto', 'last_messages': None},
 'usage': None,
 'temperature': 1.0,
 'top_p': 1.0,
 'tool_resources': {}}

与在 Chat Completions API 中创建补全不同,创建 Run 是一个异步操作。它将立即返回 Run 的元数据,其中包括最初设置为 queuedstatus。当助手执行操作(如使用工具和添加消息)时,status 将会更新。

为了知道助手何时完成处理,我们可以在循环中轮询 Run。(对流的支持即将推出!)虽然这里我们只检查 queuedin_progress 状态,但在实践中,Run 可能会经历 各种状态更改,您可以选择将其呈现给用户。(这些称为 Steps,将在稍后介绍。)

import time

def wait_on_run(run, thread):
    while run.status == "queued" or run.status == "in_progress":
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id,
            run_id=run.id,
        )
        time.sleep(0.5)
    return run
run = wait_on_run(run, thread)
show_json(run)
{'id': 'run_qVYsWok6OCjHxkajpIrdHuVP',
 'assistant_id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'cancelled_at': None,
 'completed_at': 1736340406,
 'created_at': 1736340403,
 'expires_at': None,
 'failed_at': None,
 'incomplete_details': None,
 'incomplete_details': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'last_error': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'metadata': {},
 'model': 'gpt-4o',
 'object': 'thread.run',
 'parallel_tool_calls': True,
 'parallel_tool_calls': True,
 'required_action': None,
 'response_format': 'auto',
 'started_at': 1736340405,
 'status': 'completed',
 'thread_id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6',
 'tool_choice': 'auto',
 'tools': [],
 'truncation_strategy': {'type': 'auto', 'last_messages': None},
 'usage': {'completion_tokens': 35,
  'prompt_tokens': 66,
  'total_tokens': 101,
  'prompt_token_details': {'cached_tokens': 0},
  'completion_tokens_details': {'reasoning_tokens': 0}},
 'temperature': 1.0,
 'top_p': 1.0,
 'tool_resources': {}}

现在 Run 已完成,我们可以列出 Thread 中的 Messages,以查看助手添加了什么。

messages = client.beta.threads.messages.list(thread_id=thread.id)
show_json(messages)
{'data': [{'id': 'msg_A5eAN6ZAJDmFBOYutEm5DFCy',
   'assistant_id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
   'attachments': [],
   'completed_at': None,
   'content': [{'text': {'annotations': [],
      'value': 'Sure! Subtract 11 from both sides to get \\(3x = 3\\), then divide by 3 to find \\(x = 1\\).'},
     'type': 'text'}],
   'created_at': 1736340405,
   'incomplete_at': None,
   'incomplete_details': None,
   'metadata': {},
   'object': 'thread.message',
   'role': 'assistant',
   'run_id': 'run_qVYsWok6OCjHxkajpIrdHuVP',
   'status': None,
   'thread_id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6'},
  {'id': 'msg_1q4Y7ZZ9gIcPoAKSx9UtrrKJ',
   'assistant_id': None,
   'attachments': [],
   'completed_at': None,
   'attachments': [],
   'completed_at': None,
   'content': [{'text': {'annotations': [],
      'value': 'I need to solve the equation `3x + 11 = 14`. Can you help me?'},
     'type': 'text'}],
   'created_at': 1736340400,
   'incomplete_at': None,
   'incomplete_details': None,
   'metadata': {},
   'object': 'thread.message',
   'role': 'user',
   'run_id': None,
   'status': None,
   'thread_id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6'}],
 'object': 'list',
 'first_id': 'msg_A5eAN6ZAJDmFBOYutEm5DFCy',
 'last_id': 'msg_1q4Y7ZZ9gIcPoAKSx9UtrrKJ',
 'has_more': False}

如您所见,Messages 按时间倒序排列——这样做是为了使最新的结果始终在第一个页面上(因为结果可以分页)。请留意这一点,因为这与 Chat Completions API 中的消息顺序相反。

让我们要求我们的助手进一步解释结果!

# Create a message to append to our thread
message = client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Could you explain this to me?"
)

# Execute our run
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

# Wait for completion
wait_on_run(run, thread)

# Retrieve all the messages added after our last user message
messages = client.beta.threads.messages.list(
    thread_id=thread.id, order="asc", after=message.id
)
show_json(messages)
{'data': [{'id': 'msg_wSHHvaMnaWktZWsKs6gyoPUB',
   'assistant_id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
   'attachments': [],
   'completed_at': None,
   'content': [{'text': {'annotations': [],
      'value': 'Certainly! To isolate \\(x\\), first subtract 11 from both sides of the equation \\(3x + 11 = 14\\), resulting in \\(3x = 3\\). Then, divide both sides by 3 to solve for \\(x\\), giving you \\(x = 1\\).'},
     'type': 'text'}],
   'created_at': 1736340414,
   'incomplete_at': None,
   'incomplete_details': None,
   'metadata': {},
   'object': 'thread.message',
   'role': 'assistant',
   'run_id': 'run_lJsumsDtPTmdG3Enx2CfYrrq',
   'status': None,
   'thread_id': 'thread_j4dc1TiHPfkviKUHNi4aAsA6'}],
 'object': 'list',
 'first_id': 'msg_wSHHvaMnaWktZWsKs6gyoPUB',
 'last_id': 'msg_wSHHvaMnaWktZWsKs6gyoPUB',
 'has_more': False}

对于这个简单的例子,这可能感觉需要很多步骤才能获得回复。但是,您很快就会看到,我们如何在不更改太多代码的情况下向我们的助手添加非常强大的功能!

让我们看看如何将所有这些组合在一起。以下是使用您创建的助手所需的所有代码。

由于我们已经创建了数学助手,我已将其 ID 保存在 MATH_ASSISTANT_ID 中。然后我定义了两个函数

  • submit_message:在 Thread 上创建一条 Message,然后启动(并返回)一个新的 Run
  • get_response:返回 Thread 中的 Messages 列表
from openai import OpenAI

MATH_ASSISTANT_ID = assistant.id  # or a hard-coded ID like "asst-..."

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as env var>"))

def submit_message(assistant_id, thread, user_message):
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=user_message
    )
    return client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant_id,
    )


def get_response(thread):
    return client.beta.threads.messages.list(thread_id=thread.id, order="asc")

我还定义了一个 create_thread_and_run 函数,我可以重复使用它(这实际上与我们的 API 中的 client.beta.threads.create_and_run 复合函数几乎相同 ;) )。最后,我们可以将我们的模拟用户请求分别提交到新的 Thread。

请注意,所有这些 API 调用都是异步操作;这意味着我们在代码中实际获得了异步行为,而无需使用异步库!(例如 asyncio

def create_thread_and_run(user_input):
    thread = client.beta.threads.create()
    run = submit_message(MATH_ASSISTANT_ID, thread, user_input)
    return thread, run


# Emulating concurrent user requests
thread1, run1 = create_thread_and_run(
    "I need to solve the equation `3x + 11 = 14`. Can you help me?"
)
thread2, run2 = create_thread_and_run("Could you explain linear algebra to me?")
thread3, run3 = create_thread_and_run("I don't like math. What can I do?")

# Now all Runs are executing...

一旦所有 Runs 开始运行,我们可以等待每个 Run 完成并获取响应。

import time

# Pretty printing helper
def pretty_print(messages):
    print("# Messages")
    for m in messages:
        print(f"{m.role}: {m.content[0].text.value}")
    print()


# Waiting in a loop
def wait_on_run(run, thread):
    while run.status == "queued" or run.status == "in_progress":
        run = client.beta.threads.runs.retrieve(
            thread_id=thread.id,
            run_id=run.id,
        )
        time.sleep(0.5)
    return run


# Wait for Run 1
run1 = wait_on_run(run1, thread1)
pretty_print(get_response(thread1))

# Wait for Run 2
run2 = wait_on_run(run2, thread2)
pretty_print(get_response(thread2))

# Wait for Run 3
run3 = wait_on_run(run3, thread3)
pretty_print(get_response(thread3))

# Thank our assistant on Thread 3 :)
run4 = submit_message(MATH_ASSISTANT_ID, thread3, "Thank you!")
run4 = wait_on_run(run4, thread3)
pretty_print(get_response(thread3))
# Messages
user: I need to solve the equation `3x + 11 = 14`. Can you help me?
assistant: Sure! Subtract 11 from both sides to get \(3x = 3\), then divide by 3 to find \(x = 1\).

# Messages
user: Could you explain linear algebra to me?
assistant: Linear algebra is the branch of mathematics concerning vector spaces, linear transformations, and systems of linear equations, often represented with matrices.

# Messages
user: I don't like math. What can I do?
assistant: Try relating math to real-life interests or hobbies, practice with fun games or apps, and gradually build confidence with easier problems.

# Messages
user: I don't like math. What can I do?
assistant: Try relating math to real-life interests or hobbies, practice with fun games or apps, and gradually build confidence with easier problems.
user: Thank you!
assistant: You're welcome! If you have any more questions, feel free to ask!

瞧!

您可能已经注意到,此代码实际上并非特定于我们的数学助手……只需更改助手 ID,此代码即可用于您创建的任何新助手!这就是 Assistants API 的强大之处。

工具

Assistants API 的一个关键功能是能够为我们的助手配备工具,如代码解释器、文件搜索和自定义函数。让我们分别看一下每个工具。

代码解释器

让我们为我们的数学辅导助手配备 代码解释器 工具,我们可以从仪表板执行此操作……

Enabling code interpreter

……或使用助手 ID 通过 API 执行此操作。

assistant = client.beta.assistants.update(
    MATH_ASSISTANT_ID,
    tools=[{"type": "code_interpreter"}],
)
show_json(assistant)
{'id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'created_at': 1736340398,
 'description': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'metadata': {},
 'model': 'gpt-4o',
 'name': 'Math Tutor',
 'object': 'assistant',
 'tools': [{'type': 'code_interpreter'}],
 'response_format': 'auto',
 'temperature': 1.0,
 'tool_resources': {'code_interpreter': {'file_ids': []}, 'file_search': None},
 'top_p': 1.0} 'tools': [{'type': 'code_interpreter'}],
 'response_format': 'auto',
 'temperature': 1.0,
 'tool_resources': {'code_interpreter': {'file_ids': []}, 'file_search': None},
 'top_p': 1.0}

现在,让我们要求助手使用其新工具。

thread, run = create_thread_and_run(
    "Generate the first 20 fibbonaci numbers with code."
)
run = wait_on_run(run, thread)
pretty_print(get_response(thread))
# Messages
user: Generate the first 20 fibbonaci numbers with code.
assistant: The first 20 Fibonacci numbers are: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181.

就是这样!助手在后台使用了代码解释器,并给了我们最终响应。

对于某些用例,这可能就足够了——但是,如果我们想要更详细地了解助手正在做什么,我们可以查看 Run 的 Steps。

Steps

Run 由一个或多个 Steps 组成。与 Run 类似,每个 Step 都有一个 status,您可以查询该状态。这对于向用户呈现 Step 的进度非常有用(例如,在助手编写代码或执行检索时显示微调器)。

run_steps = client.beta.threads.runs.steps.list(
    thread_id=thread.id, run_id=run.id, order="asc"
)

让我们看一下每个 Step 的 step_details

for step in run_steps.data:
    step_details = step.step_details
    print(json.dumps(show_json(step_details), indent=4))
{'tool_calls': [{'id': 'call_E1EE1loDmcWoc7FpkOMKYj6n',
   'code_interpreter': {'input': 'def generate_fibonacci(n):\n    fib_sequence = [0, 1]\n    while len(fib_sequence) < n:\n        next_value = fib_sequence[-1] + fib_sequence[-2]\n        fib_sequence.append(next_value)\n    return fib_sequence\n\n# Generate the first 20 Fibonacci numbers\nfirst_20_fibonacci = generate_fibonacci(20)\nfirst_20_fibonacci',
    'outputs': []},
   'type': 'code_interpreter'}],
 'type': 'tool_calls'}
null
{'message_creation': {'message_id': 'msg_RzTnbBMmzDYHk79a0x9qM5uU'},
 'type': 'message_creation'}
null

我们可以看到两个 Steps 的 step_details

  1. tool_calls(复数,因为一个 Step 中可能不止一个)
  2. message_creation

第一个 Step 是 tool_calls,特别是使用包含以下内容的 code_interpreter

  • input,这是在调用工具之前生成的 Python 代码,以及
  • output,这是运行代码解释器的结果。

第二个 Step 是 message_creation,其中包含添加到 Thread 的 message,以向用户传达结果。

Assistants API 中的另一个强大工具是 文件搜索。这允许将文件上传到助手,以便在回答问题时用作知识库。

Enabling retrieval

# Upload the file
file = client.files.create(
    file=open(
        "data/language_models_are_unsupervised_multitask_learners.pdf",
        "rb",
    ),
    purpose="assistants",
)

# Create a vector store
vector_store = client.beta.vector_stores.create(
    name="language_models_are_unsupervised_multitask_learners",
)

# Add the file to the vector store
vector_store_file = client.beta.vector_stores.files.create_and_poll(
    vector_store_id=vector_store.id,
    file_id=file.id,
)

# Confirm the file was added
while vector_store_file.status == "in_progress":
    time.sleep(1)
if vector_store_file.status == "completed":
    print("File added to vector store")
elif vector_store_file.status == "failed":
    raise Exception("Failed to add file to vector store")

# Update Assistant
assistant = client.beta.assistants.update(
    MATH_ASSISTANT_ID,
    tools=[{"type": "code_interpreter"}, {"type": "file_search"}],
    tool_resources={
        "file_search":{
            "vector_store_ids": [vector_store.id]
        },
        "code_interpreter": {
            "file_ids": [file.id]
        }
    },
)
show_json(assistant)
File added to vector store
{'id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'created_at': 1736340398,
 'description': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'metadata': {},
 'model': 'gpt-4o',
 'name': 'Math Tutor',
 'object': 'assistant',
 'tools': [{'type': 'code_interpreter'},
  {'type': 'file_search',
   'file_search': {'max_num_results': None,
    'ranking_options': {'score_threshold': 0.0,
     'ranker': 'default_2024_08_21'}}}],
 'response_format': 'auto',
 'temperature': 1.0,
 'tool_resources': {'code_interpreter': {'file_ids': ['file-GQFm2i7N8LrAQatefWKEsE']},
  'file_search': {'vector_store_ids': ['vs_dEArILZSJh7J799QACi3QhuU']}},
 'top_p': 1.0}
thread, run = create_thread_and_run(
    "What are some cool math concepts behind this ML paper pdf? Explain in two sentences."
)
run = wait_on_run(run, thread)
pretty_print(get_response(thread))
# Messages
user: What are some cool math concepts behind this ML paper pdf? Explain in two sentences.
assistant: The paper explores the concept of multitask learning where a single model is used to perform various tasks, modeling the conditional distribution \( p(\text{output} | \text{input, task}) \), inspired by probabilistic approaches【6:10†source】. It also discusses the use of Transformer-based architectures and parallel corpus substitution in language models, enhancing their ability to generalize across domain tasks without explicit task-specific supervision【6:2†source】【6:5†source】.

注意 文件搜索中还有更多复杂性,例如 注释,这可能会在另一个 cookbook 中介绍。

# Delete the vector store
client.beta.vector_stores.delete(vector_store.id)
VectorStoreDeleted(id='vs_dEArILZSJh7J799QACi3QhuU', deleted=True, object='vector_store.deleted')

函数

作为助手的最后一个强大工具,您可以指定自定义 函数(很像 Chat Completions API 中的 函数调用)。在 Run 期间,助手可以指示它想要调用您指定的一个或多个函数。然后,您负责调用函数,并将输出提供回助手。

让我们通过为我们的数学辅导助手定义一个 display_quiz() 函数来看一个例子。

此函数将接收 titlequestion 数组,显示测验,并获取用户对每个问题的输入

  • title
  • questions
    • question_text
    • question_type:[MULTIPLE_CHOICE, FREE_RESPONSE]
    • choices:["选项 1", "选项 2", ...]

我将使用 get_mock_response... 模拟响应。在这里您将获得用户的实际输入。

def get_mock_response_from_user_multiple_choice():
    return "a"


def get_mock_response_from_user_free_response():
    return "I don't know."


def display_quiz(title, questions):
    print("Quiz:", title)
    print()
    responses = []

    for q in questions:
        print(q["question_text"])
        response = ""

        # If multiple choice, print options
        if q["question_type"] == "MULTIPLE_CHOICE":
            for i, choice in enumerate(q["choices"]):
                print(f"{i}. {choice}")
            response = get_mock_response_from_user_multiple_choice()

        # Otherwise, just get response
        elif q["question_type"] == "FREE_RESPONSE":
            response = get_mock_response_from_user_free_response()

        responses.append(response)
        print()

    return responses

以下是示例测验的外观

responses = display_quiz(
    "Sample Quiz",
    [
        {"question_text": "What is your name?", "question_type": "FREE_RESPONSE"},
        {
            "question_text": "What is your favorite color?",
            "question_type": "MULTIPLE_CHOICE",
            "choices": ["Red", "Blue", "Green", "Yellow"],
        },
    ],
)
print("Responses:", responses)
Quiz: Sample Quiz

What is your name?

What is your favorite color?
0. Red
1. Blue
2. Green
3. Yellow

Responses: ["I don't know.", 'a']

现在,让我们以 JSON 格式定义此函数的接口,以便我们的助手可以调用它

function_json = {
    "name": "display_quiz",
    "description": "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "questions": {
                "type": "array",
                "description": "An array of questions, each with a title and potentially options (if multiple choice).",
                "items": {
                    "type": "object",
                    "properties": {
                        "question_text": {"type": "string"},
                        "question_type": {
                            "type": "string",
                            "enum": ["MULTIPLE_CHOICE", "FREE_RESPONSE"]
                        },
                        "choices": {"type": "array", "items": {"type": "string"}}
                    },
                    "required": ["question_text"]
                }
            }
        },
        "required": ["title", "questions"]
    }
}

再次,让我们通过仪表板或 API 更新我们的助手。

Enabling custom function

注意 由于缩进等原因,将函数 JSON 粘贴到仪表板中有点棘手。我只是让 ChatGPT 按照仪表板上的示例之一格式化我的函数 :)。

assistant = client.beta.assistants.update(
    MATH_ASSISTANT_ID,
    tools=[
        {"type": "code_interpreter"},
        {"type": "file_search"},
        {"type": "function", "function": function_json},
    ],
)
show_json(assistant)
{'id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'created_at': 1736340398,
 'description': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'metadata': {},
 'model': 'gpt-4o',
 'name': 'Math Tutor',
 'object': 'assistant',
 'tools': [{'type': 'code_interpreter'},
  {'type': 'file_search',
   'file_search': {'max_num_results': None,
    'ranking_options': {'score_threshold': 0.0,
     'ranker': 'default_2024_08_21'}}},
  {'function': {'name': 'display_quiz',
    'description': "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    'description': "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    'parameters': {'type': 'object',
     'properties': {'title': {'type': 'string'},
      'questions': {'type': 'array',
       'description': 'An array of questions, each with a title and potentially options (if multiple choice).',
       'items': {'type': 'object',
        'properties': {'question_text': {'type': 'string'},
         'question_type': {'type': 'string',
          'enum': ['MULTIPLE_CHOICE', 'FREE_RESPONSE']},
         'choices': {'type': 'array', 'items': {'type': 'string'}}},
        'required': ['question_text']}}},
     'required': ['title', 'questions']},
    'strict': False},
   'type': 'function'}],
 'response_format': 'auto',
 'temperature': 1.0,
 'tool_resources': {'code_interpreter': {'file_ids': ['file-GQFm2i7N8LrAQatefWKEsE']},
  'file_search': {'vector_store_ids': []}},
 'top_p': 1.0}

现在,我们要求进行测验。

thread, run = create_thread_and_run(
    "Make a quiz with 2 questions: One open ended, one multiple choice. Then, give me feedback for the responses."
)
run = wait_on_run(run, thread)
run.status
'requires_action'

但是,现在当我们检查 Run 的 status 时,我们看到 requires_action!让我们仔细看看。

show_json(run)
{'id': 'run_ekMRSI2h35asEzKirRf4BTwZ',
 'assistant_id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'cancelled_at': None,
 'completed_at': None,
 'created_at': 1736341020,
 'expires_at': 1736341620,
 'failed_at': None,
 'incomplete_details': None,
 'incomplete_details': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'last_error': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'metadata': {},
 'model': 'gpt-4o',
 'object': 'thread.run',
 'parallel_tool_calls': True,
 'required_action': {'submit_tool_outputs': {'tool_calls': [{'id': 'call_uvJEn0fxM4sgmzek8wahBGLi',
     'function': {'arguments': '{"title":"Math Quiz","questions":[{"question_text":"What is the derivative of the function f(x) = 3x^2 + 2x - 5?","question_type":"FREE_RESPONSE"},{"question_text":"What is the value of \\\\( \\\\int_{0}^{1} 2x \\\\, dx \\\\)?","question_type":"MULTIPLE_CHOICE","choices":["0","1","2","3"]}]}',
      'name': 'display_quiz'},
     'type': 'function'}]},
  'type': 'submit_tool_outputs'},
 'response_format': 'auto',
 'started_at': 1736341022,
 'status': 'requires_action',
 'thread_id': 'thread_8bK2PXfoeijEHBVEzYuJXt17',
 'tool_choice': 'auto',
 'tools': [{'type': 'code_interpreter'},
  {'type': 'file_search',
   'file_search': {'max_num_results': None,
    'ranking_options': {'score_threshold': 0.0,
     'ranker': 'default_2024_08_21'}}},
  {'function': {'name': 'display_quiz',
    'description': "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    'description': "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    'parameters': {'type': 'object',
     'properties': {'title': {'type': 'string'},
      'questions': {'type': 'array',
       'description': 'An array of questions, each with a title and potentially options (if multiple choice).',
       'items': {'type': 'object',
        'properties': {'question_text': {'type': 'string'},
         'question_type': {'type': 'string',
          'enum': ['MULTIPLE_CHOICE', 'FREE_RESPONSE']},
         'choices': {'type': 'array', 'items': {'type': 'string'}}},
        'required': ['question_text']}}},
     'required': ['title', 'questions']},
    'strict': False},
   'type': 'function'}],
 'truncation_strategy': {'type': 'auto', 'last_messages': None},
 'usage': None,
 'temperature': 1.0,
 'top_p': 1.0,
 'tool_resources': {}}    'strict': False},
   'type': 'function'}],
 'truncation_strategy': {'type': 'auto', 'last_messages': None},
 'usage': None,
 'temperature': 1.0,
 'top_p': 1.0,
 'tool_resources': {}}

required_action 字段指示一个 Tool 正在等待我们运行它并将它的输出提交回助手。具体来说,是 display_quiz 函数!让我们首先解析 namearguments

注意 虽然在本例中我们知道只有一个 Tool 调用,但在实践中,助手可能会选择调用多个工具。

# Extract single tool call
tool_call = run.required_action.submit_tool_outputs.tool_calls[0]
name = tool_call.function.name
arguments = json.loads(tool_call.function.arguments)

print("Function Name:", name)
print("Function Arguments:")
arguments
Function Name: display_quiz
Function Arguments:
{'title': 'Math Quiz',
 'questions': [{'question_text': 'What is the derivative of the function f(x) = 3x^2 + 2x - 5?',
   'question_type': 'FREE_RESPONSE'},
  {'question_text': 'What is the value of \\( \\int_{0}^{1} 2x \\, dx \\)?',
   'question_type': 'MULTIPLE_CHOICE',
   'choices': ['0', '1', '2', '3']}]}

现在让我们实际使用助手提供的参数调用我们的 display_quiz 函数

responses = display_quiz(arguments["title"], arguments["questions"])
print("Responses:", responses)
Quiz: Math Quiz
Quiz: Math Quiz

What is the derivative of the function f(x) = 3x^2 + 2x - 5?

What is the value of \( \int_{0}^{1} 2x \, dx \)?
0. 0
1. 1
2. 2
3. 3

Responses: ["I don't know.", 'a']

太棒了!(请记住,这些响应是我们之前模拟的响应。实际上,我们将从这个函数调用中获得用户的输入。)

现在我们有了响应,让我们将它们提交回助手。我们将需要 tool_call ID,它在之前解析出的 tool_call 中找到。我们还需要将我们的响应 list 编码为 str

run = client.beta.threads.runs.submit_tool_outputs(
    thread_id=thread.id,
    run_id=run.id,
    tool_outputs=tool_outputs
)
show_json(run)
{'id': 'run_ekMRSI2h35asEzKirRf4BTwZ',
 'assistant_id': 'asst_qvXmYlZV8zhABI2RtPzDfV6z',
 'cancelled_at': None,
 'completed_at': None,
 'created_at': 1736341020,
 'expires_at': 1736341620,
 'failed_at': None,
 'incomplete_details': None,
 'incomplete_details': None,
 'instructions': 'You are a personal math tutor. Answer questions briefly, in a sentence or less.',
 'last_error': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'max_completion_tokens': None,
 'max_prompt_tokens': None,
 'metadata': {},
 'model': 'gpt-4o',
 'object': 'thread.run',
 'parallel_tool_calls': True,
 'parallel_tool_calls': True,
 'required_action': None,
 'response_format': 'auto',
 'started_at': 1736341022,
 'status': 'queued',
 'thread_id': 'thread_8bK2PXfoeijEHBVEzYuJXt17',
 'tool_choice': 'auto',
 'tools': [{'type': 'code_interpreter'},
  {'type': 'file_search',
   'file_search': {'max_num_results': None,
    'ranking_options': {'score_threshold': 0.0,
     'ranker': 'default_2024_08_21'}}},
  {'function': {'name': 'display_quiz',
    'description': "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    'description': "Displays a quiz to the student, and returns the student's response. A single quiz can have multiple questions.",
    'parameters': {'type': 'object',
     'properties': {'title': {'type': 'string'},
      'questions': {'type': 'array',
       'description': 'An array of questions, each with a title and potentially options (if multiple choice).',
       'items': {'type': 'object',
        'properties': {'question_text': {'type': 'string'},
         'question_type': {'type': 'string',
          'enum': ['MULTIPLE_CHOICE', 'FREE_RESPONSE']},
         'choices': {'type': 'array', 'items': {'type': 'string'}}},
        'required': ['question_text']}}},
     'required': ['title', 'questions']},
    'strict': False},
   'type': 'function'}],
 'truncation_strategy': {'type': 'auto', 'last_messages': None},
 'usage': None,
 'temperature': 1.0,
 'top_p': 1.0,
 'tool_resources': {}}    'strict': False},
   'type': 'function'}],
 'truncation_strategy': {'type': 'auto', 'last_messages': None},
 'usage': None,
 'temperature': 1.0,
 'top_p': 1.0,
 'tool_resources': {}}

我们现在可以再次等待 Run 完成,并检查我们的 Thread!

run = wait_on_run(run, thread)
pretty_print(get_response(thread))
# Messages
user: Make a quiz with 2 questions: One open ended, one multiple choice. Then, give me feedback for the responses.
assistant: Since no specific information was found in the uploaded file, I'll create a general math quiz for you:

1. **Open-ended Question**: What is the derivative of the function \( f(x) = 3x^2 + 2x - 5 \)?

2. **Multiple Choice Question**: What is the value of \( \int_{0}^{1} 2x \, dx \)?
    - A) 0
    - B) 1
    - C) 2
    - D) 3

I will now present the quiz to you for response.
assistant: Here is the feedback for your responses:

1. **Derivative Question**: 
   - Your Response: "I don't know."
   - Feedback: The derivative of \( f(x) = 3x^2 + 2x - 5 \) is \( f'(x) = 6x + 2 \).

2. **Integration Question**: 
   - Your Response: A) 0
   - Feedback: The correct answer is B) 1. The integration \(\int_{0}^{1} 2x \, dx \) evaluates to 1.

哇哦 🎉

结论

我们在本笔记本中涵盖了很多内容,给自己一个 High-Five!希望您现在应该拥有强大的基础,可以使用代码解释器、检索和函数等工具构建强大的有状态体验!

为了简洁起见,我们没有介绍一些章节,因此这里有一些资源供您进一步探索

  • 注释:解析文件引用
  • 文件:线程作用域与助手作用域
  • 并行函数调用:在单个 Step 中调用多个工具
  • 多助手线程 Runs:来自多个助手的消息的单个 Thread
  • 流式传输:即将推出!

现在去构建一些令人惊的东西吧!