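The initial release of the o1 reasoning models (September 2024) has advanced capabilities but does not support structured outputs.
As a result, requests to o1 lack reliable type safety and depend on the prompt itself to return usable JSON.
In this guide, we'll explore two ways to prompt the o1 models, specifically o1-preview, to return valid JSON when using the OpenAI API.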
The simplest way to get a JSON response from o1-preview is to prompt for it explicitly.
Let's look at an example:
import requests
from openai import OpenAI

client = OpenAI()


def fetch_html(url):
    """Return the page's HTML, or None if the request did not succeed."""
    response = requests.get(url)
    if response.status_code == 200:
        return response.text
    else:
        return None


url = "https://en.wikipedia.org/wiki/List_of_largest_companies_in_the_United_States_by_revenue"
html_content = fetch_html(url)

# Example of the JSON shape we want the model to return.
json_format = """
{
    "companies": [
        {
            "company_name": "OpenAI",
            "page_link": "https://en.wikipedia.org/wiki/OpenAI",
            "reason": "OpenAI would benefit because they are an AI company..."
        }
    ]
}
"""

o1_response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": f"""
You are a business analyst designed to understand how AI technology could be used across large corporations.

- Read the following html and return which companies would benefit from using AI technology: {html_content}.
- Rank these prospects by opportunity by comparing them and show me the top 3. Return only as a JSON with the following format: {json_format}
"""
        }
    ]
)

print(o1_response.choices[0].message.content)
{
    "companies": [
        {
            "company_name": "Walmart",
            "page_link": "https://en.wikipedia.org/wiki/Walmart",
            "reason": "Walmart could benefit from AI technology by enhancing their supply chain management, optimizing inventory levels, improving customer service through AI-powered chatbots, and providing personalized shopping experiences. AI can help Walmart forecast demand more accurately, reduce operational costs, and increase overall efficiency."
        },
        {
            "company_name": "UnitedHealth Group",
            "page_link": "https://en.wikipedia.org/wiki/UnitedHealth_Group",
            "reason": "UnitedHealth Group could leverage AI technology to improve patient care through predictive analytics, personalize treatment plans, detect fraudulent claims, and streamline administrative processes. AI can assist in early disease detection, improve diagnostic accuracy, and enhance data analysis for better health outcomes."
        },
        {
            "company_name": "Ford Motor Company",
            "page_link": "https://en.wikipedia.org/wiki/Ford_Motor_Company",
            "reason": "Ford Motor Company could benefit from AI technology by advancing autonomous vehicle development, optimizing manufacturing processes with automation and robotics, implementing predictive maintenance, and enhancing the in-car experience with AI-driven features. AI can help Ford improve safety, reduce production costs, and innovate new transportation solutions."
        }
    ]
}
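Note that the response is already quite good: it returns JSON with appropriate content. However, it has the same pitfalls as any prompt-only JSON approach: nothing guarantees that the output parses as JSON or matches the expected types.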
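One way to guard against these pitfalls is to parse and check the reply by hand before using it. The sketch below is an illustration under stated assumptions, not part of the original guide: extract_companies is a hypothetical helper, the required keys simply mirror the json_format example above, and the Markdown-fence handling assumes the model may wrap its JSON in a code fence.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def extract_companies(raw_text):
    """Parse a prompt-only JSON reply and check its shape by hand.

    Returns the list of company dicts, or raises ValueError describing
    the first problem found. (Hypothetical helper for illustration.)
    """
    text = raw_text.strip()
    # Assumption: the model may wrap the JSON in a Markdown code fence.
    if text.startswith(FENCE):
        text = text.strip("`")
        if text.startswith("json"):
            text = text[len("json"):]

    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc

    companies = data.get("companies") if isinstance(data, dict) else None
    if not isinstance(companies, list):
        raise ValueError("reply has no 'companies' list")

    # Nothing enforces types for us, so every field is checked manually.
    required = ("company_name", "page_link", "reason")
    for i, company in enumerate(companies):
        if not isinstance(company, dict):
            raise ValueError(f"companies[{i}] is not an object")
        for key in required:
            if not isinstance(company.get(key), str):
                raise ValueError(f"companies[{i}].{key} is missing or not a string")
    return companies
```

With the first example, this would be called as extract_companies(o1_response.choices[0].message.content). The amount of hand-written checking needed here is the motivation for the structured-output approach.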
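Now let's do this with structured outputs. To enable this, we chain the o1-preview response to a follow-up request to gpt-4o-mini, which can effectively handle the data returned by the initial o1-preview response.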
from pydantic import BaseModel
from devtools import pprint


class CompanyData(BaseModel):
    company_name: str
    page_link: str
    reason: str


class CompaniesData(BaseModel):
    companies: list[CompanyData]


# Reuses `client` and `html_content` from the previous example.
# Listing the field names keeps the prompt in sync with the schema
# (`model_fields` assumes Pydantic v2; on v1 use `__fields__`).
field_names = ", ".join(CompanyData.model_fields)

# Step 1: let o1-preview do the reasoning in free-form text.
o1_response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": f"""
You are a business analyst designed to understand how AI technology could be used across large corporations.

- Read the following html and return which companies would benefit from using AI technology: {html_content}.
- Rank these prospects by opportunity by comparing them and show me the top 3. Return each with the following fields: {field_names}
"""
        }
    ]
)

o1_response_content = o1_response.choices[0].message.content

# Step 2: have gpt-4o-mini cast that text into the CompaniesData schema.
response = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": f"""
Given the following data, format it with the given response format: {o1_response_content}
"""
        }
    ],
    response_format=CompaniesData,
)

pprint(response.choices[0].message.parsed)
CompaniesData(
    companies=[
        CompanyData(
            company_name='Walmart',
            page_link='https://en.wikipedia.org/wiki/Walmart',
            reason=(
                'As the largest retailer, Walmart can significantly benefit from AI by optimizing supply chain and inv'
                'entory management, improving demand forecasting, personalizing customer experiences, and enhancing in'
                '-store operations through AI-driven analytics.'
            ),
        ),
        CompanyData(
            company_name='JPMorgan Chase',
            page_link='https://en.wikipedia.org/wiki/JPMorgan_Chase',
            reason=(
                'As a leading financial institution, JPMorgan Chase can leverage AI for fraud detection, risk manageme'
                'nt, personalized banking services, algorithmic trading, and enhancing customer service with AI-powere'
                'd chatbots and virtual assistants.'
            ),
        ),
        CompanyData(
            company_name='UnitedHealth Group',
            page_link='https://en.wikipedia.org/wiki/UnitedHealth_Group',
            reason=(
                'Being a major player in healthcare, UnitedHealth Group can utilize AI to improve patient care through'
                ' predictive analytics, enhance diagnostics, streamline administrative processes, and reduce costs by '
                'optimizing operations with AI-driven solutions.'
            ),
        ),
    ],
)
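Structured outputs give your code reliable type safety and allow simpler prompts. They also let you reuse your object schemas, which makes it easier to integrate with your existing workflows.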
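To illustrate what schema reuse can look like, here is a minimal sketch, assuming Pydantic v2 (model_validate_json is a v2 method). The cached_json payload is hypothetical sample data, standing in for data that reaches your workflow from somewhere other than the model call, such as a cache or another service. The same two models validate it and give typed attribute access.

```python
from pydantic import BaseModel, ValidationError


class CompanyData(BaseModel):
    company_name: str
    page_link: str
    reason: str


class CompaniesData(BaseModel):
    companies: list[CompanyData]


# Hypothetical payload for illustration only.
cached_json = (
    '{"companies": [{"company_name": "Walmart", '
    '"page_link": "https://en.wikipedia.org/wiki/Walmart", '
    '"reason": "Supply chain optimization."}]}'
)

# The schema used for the model response also validates other inputs.
result = CompaniesData.model_validate_json(cached_json)
names = [company.company_name for company in result.companies]

# Data that does not match the schema is rejected instead of passing silently.
try:
    CompaniesData.model_validate_json('{"companies": [{"company_name": "Walmart"}]}')
    validation_failed = False
except ValidationError:
    validation_failed = True
```

Because the parsed response from the structured-output call is an instance of the same class, downstream code can treat both sources identically.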
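The o1-preview and o1-mini models do not support structured outputs, but we can reuse the existing structured output support in gpt-4o-mini by chaining two requests together. This flow currently requires two calls, but the cost of the second gpt-4o-mini call should be small compared with the o1-preview/o1-mini call.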