How to use the DALL·E API | OpenAI Cookbook

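This notebook shows how to use OpenAI's DALL·E image API endpoints.

There are three API endpoints:

Generations: generates one or more images from an input caption
Edits: edits or extends an existing image
Variations: generates variations of an input image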

Setup

Import the packages you'll need
Import your OpenAI API key: you can do this by running `export OPENAI_API_KEY="your API key"` in your terminal.
Set a directory to save images to
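Before creating the client, it can help to fail early with a clear message when the key has not been exported. The following is a minimal sketch of that check, not part of the original notebook; the helper name `get_api_key` and its `env` parameter are hypothetical and exist only for illustration.

```python
import os


def get_api_key(env=os.environ, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    `get_api_key` is a hypothetical helper, not part of the OpenAI library.
    `env` can be any mapping, which makes the function easy to test.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; run `export {var_name}=...` in your terminal first."
        )
    return key


# Example with an explicit mapping instead of the real environment
print(get_api_key({"OPENAI_API_KEY": " sk-example "}))
```

The returned string could then be passed as `api_key` when constructing the `OpenAI` client.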

# imports
from openai import OpenAI  # OpenAI Python library to make API calls
import requests  # used to download images
import os  # used to access filepaths
from PIL import Image  # used to print and edit images

# initialize OpenAI client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as env var>"))

# set a directory to save DALL·E images to
image_dir_name = "images"
image_dir = os.path.join(os.curdir, image_dir_name)

# create the directory if it doesn't yet exist
if not os.path.isdir(image_dir):
    os.mkdir(image_dir)

# print the directory to save to
print(f"{image_dir=}")

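Generations

The generation API endpoint creates an image based on a text prompt. API reference

Required inputs

prompt (str): A text description of the desired image(s). The maximum length is 1000 characters for dall-e-2 and 4000 characters for dall-e-3.

Optional inputs

model (str): The model to use for image generation. Defaults to dall-e-2.
n (int): The number of images to generate. Must be between 1 and 10. Defaults to 1. For dall-e-3, only n=1 is supported.
quality (str): The quality of the image that will be generated. hd creates images with finer details and greater consistency across the image. This parameter is only supported for dall-e-3.
response_format (str): The format in which the generated images are returned. Must be one of "url" or "b64_json". Defaults to "url".
size (str): The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3. Defaults to "1024x1024".
style (str | null): The style of the generated images. Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. This parameter is only supported for dall-e-3.
user (str): A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse.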
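The per-model limits in the list above are easy to mix up, so a small local check before calling the API may save a failed request. The sketch below encodes those limits in plain Python; `build_generation_kwargs` and `LIMITS` are hypothetical names introduced for illustration, and the rule that dall-e-3 accepts only n=1 comes from OpenAI's API reference rather than from the list itself.

```python
# Limits per model, as described in the parameter list above
LIMITS = {
    "dall-e-2": {"max_prompt": 1000, "sizes": {"256x256", "512x512", "1024x1024"}},
    "dall-e-3": {"max_prompt": 4000, "sizes": {"1024x1024", "1792x1024", "1024x1792"}},
}


def build_generation_kwargs(prompt, model="dall-e-2", n=1, size="1024x1024",
                            quality=None, style=None):
    """Validate generation parameters locally and return them as a dict.

    Hypothetical helper: it does not call the API, it only checks the inputs.
    """
    if model not in LIMITS:
        raise ValueError(f"unknown model: {model}")
    limits = LIMITS[model]
    if len(prompt) > limits["max_prompt"]:
        raise ValueError(f"prompt exceeds {limits['max_prompt']} characters for {model}")
    if size not in limits["sizes"]:
        raise ValueError(f"size {size} is not supported by {model}")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if model == "dall-e-3" and n != 1:
        # Assumption taken from OpenAI's API reference: dall-e-3 supports only n=1
        raise ValueError("dall-e-3 supports only n=1")
    if model != "dall-e-3" and (quality is not None or style is not None):
        raise ValueError("quality and style are only supported for dall-e-3")
    if style is not None and style not in {"vivid", "natural"}:
        raise ValueError("style must be 'vivid' or 'natural'")

    kwargs = {"model": model, "prompt": prompt, "n": n, "size": size}
    if quality is not None:
        kwargs["quality"] = quality
    if style is not None:
        kwargs["style"] = style
    return kwargs


params = build_generation_kwargs(
    "A cyberpunk monkey hacker", model="dall-e-3", size="1792x1024", style="natural"
)
print(params)
```

The resulting dict could be unpacked into the call, for example `client.images.generate(**params)`.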

# create an image

# set the prompt
prompt = "A cyberpunk monkey hacker dreaming of a beautiful bunch of bananas, digital art"

# call the OpenAI API
generation_response = client.images.generate(
    model="dall-e-3",
    prompt=prompt,
    n=1,
    size="1024x1024",
    response_format="url",
)

# print response
print(generation_response)

ImagesResponse(created=1701994117, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-ced13hkOk3lXkccQgW1fAQjm.png?st=2023-12-07T23%3A08%3A37Z&se=2023-12-08T01%3A08%3A37Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A41%3A48Z&ske=2023-12-08T16%3A41%3A48Z&sks=b&skv=2021-08-06&sig=tcD0iyU0ABOvWAKsY89gp5hLVIYkoSXQnrcmH%2Brkric%3D')])

# save the image
generated_image_name = "generated_image.png"  # any name you like; the filetype should be .png
generated_image_filepath = os.path.join(image_dir, generated_image_name)
generated_image_url = generation_response.data[0].url  # extract image URL from response
generated_image = requests.get(generated_image_url).content  # download the image

with open(generated_image_filepath, "wb") as image_file:
    image_file.write(generated_image)  # write the image to the file
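The code above downloads the image from a URL. When a request is made with `response_format="b64_json"` instead, the image arrives as a base64 string in the `b64_json` field of each response item, and no download step is needed. The following sketch shows the decoding step only; `save_b64_image` is a hypothetical helper, and the payload used here is a stand-in (just the PNG signature bytes), not a real API response.

```python
import base64
import os
import tempfile


def save_b64_image(b64_json, filepath):
    """Decode a base64-encoded image string and write it to `filepath`.

    Returns the number of bytes written. Hypothetical helper for illustration.
    """
    image_bytes = base64.b64decode(b64_json)
    with open(filepath, "wb") as image_file:
        image_file.write(image_bytes)
    return len(image_bytes)


# Stand-in payload: the 8-byte PNG signature, base64-encoded.
# With a real response this would be generation_response.data[0].b64_json.
fake_payload = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
demo_path = os.path.join(tempfile.gettempdir(), "b64_demo.png")
bytes_written = save_b64_image(fake_payload, demo_path)
print(bytes_written)
```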

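Variations

The variations endpoint generates new images (variations) similar to an input image. API reference

Here we'll generate variations of the image generated above.

Required inputs

image (str): The image to use as the basis for the variation(s). Must be a valid PNG file, less than 4MB, and square.

Optional inputs

model (str): The model to use for image variations. Only dall-e-2 is supported at this time.
n (int): The number of images to generate. Must be between 1 and 10. Defaults to 1.
size (str): The size of the generated images. Must be one of "256x256", "512x512", or "1024x1024". Smaller images are faster. Defaults to "1024x1024".
response_format (str): The format in which the generated images are returned. Must be one of "url" or "b64_json". Defaults to "url".
user (str): A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse.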
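The image requirements above (PNG, under 4MB, square) can be checked locally before uploading. The sketch below reads only the PNG signature and the IHDR header with the standard library, so it does not prove the whole file is valid; `check_png_for_upload` is a hypothetical helper, and treating "4MB" as 4 * 1024 * 1024 bytes is an assumption.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 4 * 1024 * 1024  # assumption: "4MB" interpreted as 4 MiB


def check_png_for_upload(data):
    """Return a list of problems found in the image bytes (empty if none).

    Only the signature and the IHDR chunk are inspected.
    """
    problems = []
    if len(data) >= MAX_BYTES:
        problems.append("file is not smaller than 4MB")
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then width and height (big-endian)
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        problems.append("not a PNG file")
        return problems
    width, height = struct.unpack(">II", data[16:24])
    if width != height:
        problems.append(f"image is not square ({width}x{height})")
    return problems


# Build a minimal header for a 1024x512 image to show a failing check
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1024, 512)
print(check_png_for_upload(header))
```

With the notebook's variables, the same check could be run on the downloaded bytes, for example `check_png_for_upload(generated_image)`.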

# create variations

# call the OpenAI API, using `create_variation` rather than `create`
variation_response = client.images.create_variation(
    image=generated_image,  # generated_image is the image generated above
    n=2,
    size="1024x1024",
    response_format="url",
)

# print response
print(variation_response)

ImagesResponse(created=1701994139, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-noNRGgwaaotRGIe6Y2GVeSpr.png?st=2023-12-07T23%3A08%3A59Z&se=2023-12-08T01%3A08%3A59Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A39%3A11Z&ske=2023-12-08T16%3A39%3A11Z&sks=b&skv=2021-08-06&sig=ER5RUglhtIk9LWJXw1DsolorT4bnEmFostfnUjY21ns%3D'), Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-oz952tL11FFhf9iXXJVIRUZX.png?st=2023-12-07T23%3A08%3A59Z&se=2023-12-08T01%3A08%3A59Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A39%3A11Z&ske=2023-12-08T16%3A39%3A11Z&sks=b&skv=2021-08-06&sig=99rJOQwDKsfIeerlMXMHholhAhrHfYaQRFJBF8FKv74%3D')])

# save the images
variation_urls = [datum.url for datum in variation_response.data]  # extract URLs
variation_images = [requests.get(url).content for url in variation_urls]  # download images
variation_image_names = [f"variation_image_{i}.png" for i in range(len(variation_images))]  # create names
variation_image_filepaths = [os.path.join(image_dir, name) for name in variation_image_names]  # create filepaths
for image, filepath in zip(variation_images, variation_image_filepaths):  # loop through the variations
    with open(filepath, "wb") as image_file:  # open the file
        image_file.write(image)  # write the image to the file

# print the original image
print(generated_image_filepath)
display(Image.open(generated_image_filepath))

# print the new variations
# (the loop variable has its own name so it does not overwrite the list)
for variation_image_filepath in variation_image_filepaths:
    print(variation_image_filepath)
    display(Image.open(variation_image_filepath))

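Edits

The edit endpoint uses DALL·E to generate a specified portion of an existing image. Three inputs are needed: the image to edit, a mask specifying the portion to be regenerated, and a prompt describing the desired image. API reference

Required inputs

image (str): The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask.
prompt (str): A text description of the desired image(s). The maximum length is 1000 characters.

Optional inputs

mask (file): An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where image should be edited. Must be a valid PNG file, less than 4MB, and have the same dimensions as image.
model (str): The model to use for editing the image. Only dall-e-2 is supported at this time.
n (int): The number of images to generate. Must be between 1 and 10. Defaults to 1.
size (str): The size of the generated images. Must be one of "256x256", "512x512", or "1024x1024". Smaller images are faster. Defaults to "1024x1024".
response_format (str): The format in which the generated images are returned. Must be one of "url" or "b64_json". Defaults to "url".
user (str): A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse.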

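Set Edit Area

An edit requires a "mask" to specify which portion of the image to regenerate. Any pixel with an alpha of 0 (transparent) will be regenerated. The code below creates a 1024x1024 mask where the bottom half is transparent.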

# create a mask
width = 1024
height = 1024
mask = Image.new("RGBA", (width, height), (0, 0, 0, 255))  # create an opaque image mask (alpha 255)

# set the bottom half to be transparent
for x in range(width):
    for y in range(height // 2, height):  # only loop over the bottom half of the mask
        # set alpha (A) to zero to turn pixel transparent
        alpha = 0
        mask.putpixel((x, y), (0, 0, 0, alpha))

# save the mask
mask_name = "bottom_half_mask.png"
mask_filepath = os.path.join(image_dir, mask_name)
mask.save(mask_filepath)

Perform Edit

Now we supply our image, caption and mask to the API to get an edited version of our image (the call below requests n=1).

# edit an image

# call the OpenAI API
edit_response = client.images.edit(
    image=open(generated_image_filepath, "rb"),  # from the generation section
    mask=open(mask_filepath, "rb"),  # from right above
    prompt=prompt,  # from the generation section
    n=1,
    size="1024x1024",
    response_format="url",
)

# print response
print(edit_response)

ImagesResponse(created=1701994167, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-9UOVGC7wB8MS2Q7Rwgj0fFBq.png?st=2023-12-07T23%3A09%3A27Z&se=2023-12-08T01%3A09%3A27Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A40%3A37Z&ske=2023-12-08T16%3A40%3A37Z&sks=b&skv=2021-08-06&sig=MsRMZ1rN434bVdWr%2B9kIoqu9CIrvZypZBfkQPTOhCl4%3D')])

# save the image
edited_image_name = "edited_image.png"  # any name you like; the filetype should be .png
edited_image_filepath = os.path.join(image_dir, edited_image_name)
edited_image_url = edit_response.data[0].url  # extract image URL from response
edited_image = requests.get(edited_image_url).content  # download the image

with open(edited_image_filepath, "wb") as image_file:
    image_file.write(edited_image)  # write the image to the file