Image generation with OpenAI Dall-E APIs

07/05/2023 openai dall-e generativeAI

In this post I will show how to use OpenAI's generative AI APIs to create images from a text prompt.

If you are new to using the OpenAI APIs, check out my last post introducing the APIs and exploring the Postman collection.

What is Dall-E?

The OpenAI website describes it as: "DALL·E 2 is an AI system that can create realistic images and art from a description in natural language."

OpenAI APIs

Have a look in the Postman API collection. The Images folder has 3 examples: generation, edit and variation.

Using the Generation API

Note: at the time of writing the latest API version is v1, and these examples are based on it.

Let's start with the generation API endpoint. As the name suggests, it generates images: we send a text prompt and get one or more images back.

POST

/v1/images/generations

parameters:

  • prompt - a required text field describing the image(s) you want to create.
  • n - optional field for the number of images to create.
  • size - optional field determining the size of the returned image(s). Must be one of 256x256, 512x512, or 1024x1024. Defaults to 1024x1024.
  • response_format - optional field determining the format in which the generated images are returned. Must be one of url or b64_json. Defaults to url.
  • user - optional field containing a unique ID that identifies your end user.

For example, this request body asks for 2 images of a cute black cat at 1024x1024:

{
  "prompt": "A cute black cat",
  "n": 2,
  "size": "1024x1024"
}
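The same request can also be sent from code rather than Postman. Below is a minimal Python sketch using only the standard library. It assumes the endpoint lives at https://api.openai.com/v1/images/generations, that the API key is passed as a Bearer token, and that the key is stored in an OPENAI_API_KEY environment variable; the function names are my own and purely illustrative.

```python
import json
import os
import urllib.request

# Assumption: base URL of the v1 generations endpoint described above.
API_URL = "https://api.openai.com/v1/images/generations"

ALLOWED_SIZES = {"256x256", "512x512", "1024x1024"}
ALLOWED_FORMATS = {"url", "b64_json"}


def build_generation_request(prompt, api_key, n=1, size="1024x1024",
                             response_format="url"):
    """Build (but do not send) a POST request for the generations endpoint."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"size must be one of {sorted(ALLOWED_SIZES)}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError("response_format must be 'url' or 'b64_json'")

    body = {
        "prompt": prompt,
        "n": n,
        "size": size,
        "response_format": response_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def generate_images(prompt, n=1, size="1024x1024"):
    """Send the request and return the parsed JSON response."""
    # Assumption: the API key is stored in the OPENAI_API_KEY environment variable.
    api_key = os.environ["OPENAI_API_KEY"]
    request = build_generation_request(prompt, api_key, n=n, size=size)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid key set, calling generate_images("A cute black cat", n=2, size="1024x1024") should send the same body as the JSON example above.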

Let's go over to our Postman collection and try it out (see the previous blog post if this is new to you). Update the body JSON in the Create Image example with your own request, following the parameters above.

This returns a response containing 2 image URLs, one for each generated image.
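If you are handling the response in code, something like the sketch below could pull the images out of it. It assumes the response is a JSON object with a "data" list whose items carry either a "url" or a "b64_json" key, matching the two response_format options; the helper name and the image_N.png file names are hypothetical.

```python
import base64
from pathlib import Path


def extract_images(response, output_dir="."):
    """Return image URLs, or paths of files written for b64_json items."""
    results = []
    for index, item in enumerate(response["data"]):
        if "url" in item:
            # response_format=url: the API returns a link to the image.
            results.append(item["url"])
        elif "b64_json" in item:
            # response_format=b64_json: decode the image and save it locally.
            path = Path(output_dir) / f"image_{index}.png"
            path.write_bytes(base64.b64decode(item["b64_json"]))
            results.append(str(path))
    return results
```

Choosing b64_json saves a second download step, at the cost of a much larger response body.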

Now we can edit our prompt to tune the images we get. For example, for a photorealistic image I can use the prompt "A cute black cat in a photo realistic style", or for a watercolour look, "A water colour style painting of a cute black cat".

You can refine your images through the prompt alone. Writing effective prompts is becoming a skill in its own right, known as "prompt engineering". I'd recommend doing some reading and experimenting to see what works and what doesn't in the responses.
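One way to compare prompts side by side is to generate a request body per prompt while keeping n and size fixed. This is only a small sketch; the helper name is hypothetical and the prompts are the two styled ones from this post.

```python
import json

# The two styled prompts used in this post.
PROMPTS = [
    "A cute black cat in a photo realistic style",
    "A water colour style painting of a cute black cat",
]


def build_request_bodies(prompts, n=1, size="1024x1024"):
    """Return one generations request body per prompt, sharing n and size."""
    return [{"prompt": prompt, "n": n, "size": size} for prompt in prompts]


# Print each body so it can be pasted into Postman or sent from code.
for body in build_request_bodies(PROMPTS, n=2):
    print(json.dumps(body))
```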

Using the Variation API

If you have an image you like the style of but would like some other options, you can upload it and get alternatives using the variations API endpoint.

POST

/v1/images/variations

parameters:

  • image - a required field with an image the variations will be based on. Must be a valid PNG file, less than 4MB, and square.
  • n - optional field for the number of images to create.
  • size - optional field determining the size of the returned image(s). Must be one of 256x256, 512x512, or 1024x1024. Defaults to 1024x1024.
  • response_format - optional field determining the format in which the generated images are returned. Must be one of url or b64_json. Defaults to url.
  • user - optional field containing a unique ID that identifies your end user.
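Since the image parameter has several constraints, it can be handy to check a file locally before uploading it. The sketch below reads only the PNG header (the 8-byte signature and the IHDR chunk, which holds width and height), so it is a quick sanity check rather than full PNG validation. The function name is hypothetical, and I've interpreted "less than 4MB" as 4 * 1024 * 1024 bytes, which is an assumption.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Assumption: "less than 4MB" is treated as fewer than 4 * 1024 * 1024 bytes.
MAX_BYTES = 4 * 1024 * 1024


def check_variation_image(data):
    """Return a list of problems; an empty list means the header looks fine."""
    problems = []
    if len(data) >= MAX_BYTES:
        problems.append("file must be smaller than 4MB")
    # A PNG starts with an 8-byte signature followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then width and height as
    # big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        problems.append("file is not a valid PNG")
        return problems
    width, height = struct.unpack(">II", data[16:24])
    if width != height:
        problems.append(f"image must be square, got {width}x{height}")
    return problems
```

You could call it with the file contents, for example check_variation_image(open("cat.png", "rb").read()), where cat.png is a placeholder file name, and only upload when the returned list is empty.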

For example, I uploaded one of my previously generated images and the API returned a set of variations on it.

AI Generated Images and Copyright

There is an ongoing discussion about how AI models are trained on copyrighted images owned by others, and about whether AI-generated images can (or can't) be copyrighted. I suspect this will continue to evolve over the coming months and years as generative AI continues to grow. At the time of writing, here are some useful articles to learn more:

---- 

Hopefully this was a useful intro to the Dall-E APIs to help you get started with AI image generation!