AI advancements have led to a surge in personalized experiences. Instead of using generic avatars, you can now create a unique digital persona from just a text description. This empowers developers to go beyond traditional avatar solutions, offering users a richer, more interactive experience.
This blog post guides you through building a Next.js application that utilizes Dall-E 3 and Cloudinary to create and store avatars.
Cloudinary securely stores your creations, optimizing them for fast loading and various devices through its global content delivery network (CDN) — all in one streamlined workflow.
The complete source code for this project is on GitHub, and here’s what the finished app looks like.
Dall-E 3 by OpenAI is an AI image generation model that creates realistic images from text descriptions. The text is converted into a numerical representation and fed into a powerful image generation model, which draws on its training across vast numbers of text-image pairs to connect words with visuals. The model then produces a clear, detailed image that aligns with the user’s description.
Note that Dall-E 3 isn’t free to use: within ChatGPT, access is limited to Plus, Team, and Enterprise users, and programmatic access through the OpenAI API is billed per generated image.
To get the most out of this article, you must have the following:
- A basic understanding of Tailwind CSS, Next.js, and TypeScript.
- A free Cloudinary account.
- An OpenAI account with API access to Dall-E 3.
With your Cloudinary and OpenAI accounts ready, bootstrap a Next.js app with the command below:
npx create-next-app@latest
As you receive prompts to set up the app, make sure to select TypeScript, Tailwind CSS, and App Router, as they’re essential for building this project. After making your selections, install the necessary dependencies using the following command.
npm install cloudinary react-copy-to-clipboard @types/react-copy-to-clipboard react-icons openai
These installed packages perform the following tasks:
- cloudinary. Provides an interface to interact with Cloudinary.
- react-copy-to-clipboard. Provides a React component that enables users to copy text with a single click.
- @types/react-copy-to-clipboard. Type definitions for the react-copy-to-clipboard package.
- react-icons. Offers a collection of icons (e.g., Font Awesome or Material UI) that can be easily integrated into your application.
- openai. Interacts with the OpenAI API.
Once done, head to the next.config.mjs file in the project’s root directory and add this code snippet to allow Next.js to display images generated from Cloudinary:
/** @type {import('next').NextConfig} */
const nextConfig = {
images: {
formats: ["image/avif", "image/webp"],
remotePatterns: [
{
protocol: "https",
hostname: "res.cloudinary.com",
},
],
},
};
export default nextConfig;
This prioritizes the AVIF and WebP formats for image optimization. It also allows Next.js to display images served from res.cloudinary.com, the hostname Cloudinary delivers assets from.
Head to Cloudinary and access the dashboard to find the credentials required for this project: your cloud name, API key, and API secret. Copy and store them somewhere safe; you’ll need them shortly.
Next, navigate to the OpenAI API key page and create a new key, as shown below. The key allows you to interact with Dall-E 3. Copy and store it somewhere safe, too.
Once you have your OpenAI API key and Cloudinary credentials, create a file named .env.local in your project’s root directory and store them there:
# .env.local
NEXT_PUBLIC_OPENAI_API_KEY=<openai_api_key>
NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME=<cloudinary_cloud_name>
NEXT_PUBLIC_CLOUDINARY_API_KEY=<cloudinary_api_key>
CLOUDINARY_API_SECRET=<cloudinary_api_secret>
Never share your credentials publicly.
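Keep in mind that variables prefixed with NEXT_PUBLIC_ are inlined into the browser bundle at build time, while CLOUDINARY_API_SECRET stays readable only in server-side code, so only the secret is truly hidden from clients. A quick, throwaway sanity check (assuming you restart the dev server after editing .env.local) might look like this:
// Hypothetical check inside any server-side file, such as an API route
console.log(Boolean(process.env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME)); // true once the variable loads
console.log(Boolean(process.env.CLOUDINARY_API_SECRET)); // defined on the server only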
For the sake of this post, we limit the app’s functionality to the user avatar generation system, which you can eventually apply or embed in your application.
The App component handles user input (prompts and names) and displays the generated avatar alongside the user’s name. Navigate to the src/app/page.tsx file and update it with the code block below.
// src/app/page.tsx
"use client";
import { useState } from "react";
import { FaRegPaperPlane } from "react-icons/fa";
export default function Home() {
const [value, setValue] = useState<string>("");
const [firstName, setFirstName] = useState<string>("");
const [lastName, setLastName] = useState<string>("");
// handlePrompt is fully implemented in a later step; this placeholder keeps the component compiling
const handlePrompt = async (e: any) => { e.preventDefault() };
return (
<main className="flex min-h-screen flex-col items-center justify-between p-24">
<h1 className="relative text-3xl font-semibold capitalize ">
Avatar image manager
</h1>
<div>
<div className="grid grid-cols-2 gap-2">
<input type="text" placeholder="First Name" name="value" onChange={(e) => { setFirstName(e.target.value) }} className="bg-gray-100 placeholder:text-gray-400 disabled:cursor-not-allowed border border-gray-500 text-gray-900 text-sm rounded-lg block p-3.5 mr-2 w-[290px]" required />
<input type="text" placeholder="Last Name" name="value" onChange={(e) => { setLastName(e.target.value) }} className="bg-gray-100 placeholder:text-gray-400 disabled:cursor-not-allowed border border-gray-500 text-gray-900 text-sm rounded-lg block p-3.5 mr-2 w-[290px]" required />
</div>
<div className="flex items-center justify-between">
<input type="text" placeholder="Enter an image prompt" name="value" onChange={(e) => { setValue(e.target.value) }} className="bg-gray-100 placeholder:text-gray-400 disabled:cursor-not-allowed border border-gray-500 text-gray-900 text-sm rounded-lg block p-3.5 mr-2 w-[600px]" required />
<button className="text-white bg-blue-700 font-medium rounded-lg text-sm transition-all sm:w-auto px-5 py-3 text-center" onClick={handlePrompt}>
Send
</button>
</div>
</div>
</main>
)
}
The component renders the user interface elements for the name and prompt fields. It uses input fields to capture user input and a button that submits the prompt for image generation. Additionally, the component manages the state variables value, firstName, and lastName to store the current user input values. It should look like this:
To provide users with visual feedback during prompt processing, create a Loader component file, src/components/Loader.tsx, with the code snippet below:
//src/components/Loader.tsx
const Loader = () => {
return (
<div className="flex justify-center items-center">
<div className="spinner">
</div>
</div>
);
}
export default Loader;
Then, navigate to the globals.css file and add the Loader component’s CSS styles.
/* src/app/globals.css */
.spinner {
width: 24px; /* give the spinner a fixed size so it renders clearly inside the button */
height: 24px;
border: 4px solid #f3f3f3;
border-top-color: #3498db; /* Adjust color as desired */
border-radius: 50%;
animation: spin 1s linear infinite;
}
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
In this step, you’ll send the user’s prompt to OpenAI’s API to generate an image, then enhance it using Cloudinary transformations before storing it. The prompt will be sent as a body parameter to the /api/generate endpoint on the server. The response from the endpoint containing the transformed image URL will then be retrieved and processed.
For the next step, you’ll use Next.js’ API routes to handle API requests.
First, you’ll create a file to serve as an API route for your POST request. To do this, create a folder in the src directory called pages/api. In the pages/api folder, add a new file named generate.ts, then add this code sample:
//src/pages/api/generate.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.NEXT_PUBLIC_OPENAI_API_KEY });
export default async function handler(req: NextApiRequest, res: NextApiResponse) {
try {
const { prompt } = req.body; // Extract the prompt from the request body
// Generate image using OpenAI
const generatedImage = await openai.images.generate({
model: "dall-e-3",
prompt: prompt,
n: 1,
size: "1024x1024",
});
// Extract image URL
const imageUrl = generatedImage.data[0].url;
} catch (error: any) {
console.error(error);
res.status(500).json({ message: error.message });
}
}
Here’s a breakdown of the code snippet above:
- Line 1: Imports types from Next.js to define the structure of the request and response objects.
- Lines 2-3: Creates a new OpenAI client instance to interact with the OpenAI API.
- Line 7: Extracts the prompt property from the request body.
- Lines 9-14: Makes an asynchronous call to the openai.images.generate method, specifying Dall-E 3 as the model, passing the extracted prompt, requesting one image, and setting the desired image size to 1024×1024 pixels.
- Line 16: Extracts the image URL from the OpenAI API’s successful response.
- Lines 17-20: Catches errors during the API call or processing. (Note that no success response is sent yet; see the note just after this list.)
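At this stage, the handler generates an image but never responds on success; the response is wired up in the next step together with the Cloudinary upload. If you’d like to verify image generation in isolation first, you could temporarily return the raw OpenAI URL at the end of the try block (a hypothetical throwaway line, removed in the next step):
// Hypothetical interim response for testing image generation on its own
res.status(200).json({ image_url: imageUrl });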
The next step involves applying some transformations to the image, uploading it to Cloudinary for storage, and responding with the transformed image’s URL. For this, you’ll update the generate.ts file as shown below.
//pages/api/generate.ts
import type { NextApiRequest, NextApiResponse } from 'next';
import OpenAI from "openai";
+ import { v2 as cloudinary } from 'cloudinary';
+ cloudinary.config({
+ cloud_name: process.env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME,
+ api_key: process.env.NEXT_PUBLIC_CLOUDINARY_API_KEY,
+ api_secret: process.env.CLOUDINARY_API_SECRET,
+ });
const openai = new OpenAI({ apiKey: process.env.NEXT_PUBLIC_OPENAI_API_KEY });
export default async function handler(req: NextApiRequest, res: NextApiResponse) {
try {
const { prompt } = req.body;
const generatedImage = await openai.images.generate({
model: "dall-e-3", prompt: prompt, n: 1, size: "1024x1024",
});
// Extract image URL
const imageUrl = generatedImage.data[0].url;
// Generate a unique public ID for Cloudinary
+ const uniqueName = req.body.prompt + ' ' + Math.floor((Math.random() * 100) + 1);
+ const publicId = uniqueName.trim().replace(/\s+/g, '-');
// Upload image to Cloudinary with transformations
+ const response = await cloudinary.uploader.upload(imageUrl, {
+ transformation: [
+ { width: 200, height: 200, gravity: "face", radius: "max", border: "2px_solid_blue", crop: "thumb" }
+ ],
+ resource_type: 'image',
+ public_id: `${publicId}`,
+ });
// Extract uploaded image URL from Cloudinary response
+ const uploadedImageUrl = response.secure_url;
// Send successful response with uploaded image URL
+ res.status(200).json({ image_url: uploadedImageUrl });
} catch (error: any) {
console.error(error);
res.status(500).json({ message: error.message });
}
}
The code sample above does the following:
- Lines 4-9: Imports the Cloudinary Node.js SDK (version 2) and configures it using the environment variables.
- Lines 20 and 21: Generates a unique identifier for the uploaded image by combining the user’s prompt with a random number, then trimming the result and replacing spaces with hyphens to form a valid Cloudinary public ID.
- Lines 23-29: Handles uploading the image to Cloudinary, specifying it as an image resource and assigning the unique public ID. It also applies several transformations before storage. Cloudinary significantly enhances the image manipulation process, offering transformations beyond what simple prompts can achieve (the equivalent delivery-time URL is sketched after this list). Here’s a breakdown of the transformations:
  - width: 200, height: 200 resizes the image to 200×200 pixels.
  - gravity: "face" centers the crop on a face, if one is detected.
  - radius: "max" rounds the image into an ellipse or circle.
  - border: "2px_solid_blue" adds a solid blue border of 2 pixels.
  - crop: "thumb" generates a thumbnail of the image.
- Lines 31 and 33: Extracts the secure URL of the uploaded image from the Cloudinary upload response and sends it as a successful response in a JSON object.
- Lines 34-37: Catches any errors that might occur during the upload process.
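For context, these upload-time transformations are baked into the stored asset, so the secure_url that Cloudinary returns already points at the finished avatar. The same chain could instead be applied on the fly at delivery time; a minimal sketch of such a URL, where <cloud_name> and <public_id> are placeholders for your own values:
https://res.cloudinary.com/<cloud_name>/image/upload/w_200,h_200,c_thumb,g_face,r_max,bo_2px_solid_blue/<public_id>.png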
Next, you’ll need to send data to the API route. Head back to your src/app/page.tsx file, create a function called handlePrompt, and trigger it on the button element’s click. Then, convert the value state variable (containing the prompt) into a JSON object and set it as the API request body.
To handle the response from the API route, you’ll store the transformed image URL from the response in a state variable named url. You’ll also create a function, handleCharacters, and a state variable, characters, to extract and store the user’s name. Do this with the code snippet below:
// src/app/page.tsx
"use client";
import { useState } from "react";
import { FaRegPaperPlane } from "react-icons/fa";
export default function Home() {
const [url, setUrl] = useState<string>("");
const [loading, setLoading] = useState<boolean | undefined>(false);
const [characters, setCharacters] = useState<string>("");
const [value, setValue] = useState<string>("");
const [firstName, setFirstName] = useState<string>("");
const [lastName, setLastName] = useState<string>("");
const handleCharacters = () => {
const char = firstName + " " + lastName
setCharacters(char)
}
const handlePrompt = async (e: any) => {
e.preventDefault()
setLoading(true)
try {
const response = await fetch('/api/generate', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ prompt: value }), // send the string as a JSON object
});
// Handle success, such as updating UI or showing a success message
if (response.ok) {
const data = await response.json();
setUrl(data.image_url)
setValue("")
handleCharacters()
}
} catch (error) {
// Handle network errors or other exceptions
console.error('Error uploading file:', error);
setLoading(false) // stop the loader so the user can retry
}
};
return (
<main className="flex min-h-screen flex-col items-center justify-between p-24">
... // App Title
<button className="text-white bg-blue-700 font-medium rounded-lg text-sm transition-all sm:w-auto px-5 py-3 text-center" onClick={handlePrompt}>
Send
</button>
</main>
)
}
The handleCharacters function takes the firstName and lastName values, combines them into a single string, and updates the characters state variable to reflect the concatenated name.
Next, you’ll display the user’s name, transformed image, and image URL only when the response from the server is successful. To achieve this, render the image and name only when the url state variable has a value.
Then, hide the loading indicator using the onLoadingComplete prop, which fires once the image has loaded successfully. The same loading state also drives the button’s content, switching between the Loader and the Send label depending on whether the application is processing a request.
// src/app/page.tsx
"use client";
import Image from "next/image";
import { useState } from "react";
export default function Home() {
const [url, setUrl] = useState<string>("");
const [loading, setLoading] = useState<boolean | undefined>(false);
... // characters state, handleCharacters, and handlePrompt from the previous step
return (
<main className="flex min-h-screen flex-col items-center justify-between p-24">
... //App Title
{url &&
<div className="flex flex-col items-center justify-center">
<div className="flex justify-center items-center gap-4">
<Image src={url} onLoadingComplete={() => setLoading(false)} width={192} height={192} alt="ai image" />
{!loading && <h1 className="text-4xl ">{characters}</h1>}
</div>
</div>
}
... //input
<button className="text-white bg-blue-700 font-medium rounded-lg text-sm transition-all sm:w-auto px-5 py-3 text-center" onClick={handlePrompt}>
{loading ? <Loader /> : "Send"}
</button>
</main>
)
}
Next, you’ll display the transformed image URL. To display the image URL:
- Use a ternary operator to render an input field only when the loading state variable is false, including a button to copy the input value.
- Bind the input value to the url state variable to update the input field with the image URL.
- Enclose the button element in a CopyToClipboard component. This component allows a user to copy the input value (image URL) to the clipboard. The component takes two props: text (string) defines the text to be copied (typically the transformed image URL), and onCopy (function) triggers a function called onCopyText whenever the copy operation is successful.
The onCopyText function will display a visual confirmation message once a user copies the URL. Here’s what it looks like.
// src/app/page.tsx
"use client";
import Image from "next/image";
import { useState } from "react";
import { CopyToClipboard } from "react-copy-to-clipboard";
import { FaRegCopy } from "react-icons/fa";
export default function Home() {
const [loading, setLoading] = useState<boolean | undefined>(false);
const [copyStatus, setCopyStatus] = useState(false);
... // url, characters, and other state variables from the previous steps
// handleCharacters and handlePrompt functions
const onCopyText = () => {
setCopyStatus(true);
setTimeout(() => setCopyStatus(false), 2000); // Reset status after 2 seconds
};
return (
<main className="flex min-h-screen flex-col items-center justify-between p-24">
... //App Title
{url &&
... // Image and User Name
{!loading &&
<div className="mt-8">
<div className=" flex justify-center items-center gap-2 mt-2">
<p className="text-sm font-semibold">Image URL:</p>
<input id="clipboard" value={url} className="bg-gray-100 border border-gray-500 text-gray-900 text-sm rounded-lg p-1.5 w-[340px]" />
<CopyToClipboard text={url} onCopy={onCopyText}>
<button className="text-sm font-semibold text-blue-500">
<FaRegCopy />
</button>
</CopyToClipboard>
</div>
{copyStatus && <p className="text-center">Text copied to clipboard!</p>}
</div>
}
</div>
}
... //Input and Button
</main>
)
}
The app should look similar to this when you enter a name and a prompt like “a Siamese cat”:
After setting up the app, users can enter a text prompt describing their desired image, and Dall-E 3 will generate a unique image based on their input. Then, the image undergoes transformations using Cloudinary before being stored for later use. The application will look like this.
While Cloudinary is great at storing AI-generated images, it also ensures they are delivered quickly over a CDN, enhancing your application’s performance. Other Cloudinary features are important for a smooth user experience, including:
- Automatic optimizations. Reduced storage footprint and faster loading times.
- On-the-fly transformations. Image delivery in various forms and sizes.
Applying on-the-fly transformations to uploaded AI images is a personal favorite.
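For instance, appending Cloudinary’s automatic format and quality parameters to a delivery URL serves each visitor the best format and compression level for their device, with no re-upload required. A minimal sketch, where <cloud_name> and <public_id> are placeholders for your own values:
https://res.cloudinary.com/<cloud_name>/image/upload/f_auto,q_auto/<public_id>.png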
This blog post demonstrates how to generate AI images through Dall-E 3 and upload them to Cloudinary. This streamlined workflow empowers users to generate, modify, and store AI images in a single platform. Eventually, you’ll embed this function in your application or build a standalone product with it. To learn more about how integrating advanced solutions like Cloudinary’s CDN brings a new level of optimization and customization to content delivery, contact us today.
If you found this post helpful and would like to learn more, feel free to join the Cloudinary Community forum and its associated Discord.