
Generative AI for Complex Image Editing at Scale in Cloudinary’s Latest Innovations

Generative AI Features

Complex image editing typically requires skilled expertise, expensive tools, and time, all scarce commodities for enterprise and SMB teams dealing with increasingly tight budgets and timelines. Enter Cloudinary’s latest generative AI features, which perform complex image editing tasks like removing objects, filling images with additional background, or adding contextual captions for SEO and accessibility in minutes. These innovations save time and resources while removing the bottlenecks to getting compelling visual experiences to market faster.

Alongside Cloudinary’s innovations, the generative AI space in photo editing has seen significant advancements with tools from various companies. For instance, Adobe Photoshop offers some AI-driven capabilities, allowing users to add or remove content from images using text prompts. Google Photos Magic Editor uses a combination of AI techniques, including generative AI, to enable specific edits on parts of an image.

Our latest edition of monthly innovations also enables developers to improve client-side search performance by generating cacheable search URLs.

This development mirrors the broader industry’s move towards more efficient and accessible AI tools in photo editing. For example, tools like Picsart Ignite and Facet.ai offer suites of generative AI photo editing tools aimed at reducing editing time and effort. Additionally, the Imagen model from Vertex AI showcases the capabilities of text-to-image foundation models, further expanding the scope of generative AI in photo editing.

To see these features in action, visit Cloudinary Academy or our YouTube channel.

For a complete list of the latest innovations, check out our release notes.

Removing unwanted objects from an image to keep the focus on the primary subject, and extending the background when cropping or changing an image’s aspect ratio, are often complex and time-consuming tasks.

The new Generative Remove feature uses cutting-edge generative AI technology for automated object detection and intelligent pixel generation to remove unwanted objects in an image as though they never existed. Similarly, the Generative Fill feature intelligently fills blank padding with relevant background pixels when changing the image’s aspect ratio to ensure a visually appealing look.

While other creative tools offer such features, Cloudinary’s approach differs in that it enables technical teams with no knowledge of complex creative tools to perform these tasks at scale with simple, programmatic, URL-based solutions.

The Generative Remove and Generative Fill features can be leveraged across many use cases, like:

  • Branded e-commerce imagery. Expanding images in product detail pages (PDPs) and removing unwanted accessories or props to put more focus on the primary product.
  • User-generated content. Removing third-party logos, photobombers, or anything else that may cause distraction.
  • Travel and hospitality. Enhancing scenic imagery in marketing content by removing undesired elements such as vehicles or litter and expanding to a widescreen view.

Generative Remove example: Remove a third-party logo from the hat.

Generative Fill example: Fill additional background to seamlessly expand the model image.
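
As a rough illustration of what "URL-based" means here, the two examples above could be requested with delivery URLs along the following lines. This is a minimal sketch, not taken from this post: the transformation parameters (`e_gen_remove:prompt_...`, and `b_gen_fill` combined with `c_pad` and `ar_`) are written as we understand them from Cloudinary's transformation reference, and the cloud name `demo` and the public ID `model.jpg` are placeholders. Check the reference for the exact syntax before relying on it.

```python
from urllib.parse import quote

# Assumption: the standard Cloudinary delivery URL layout. The cloud name and
# public IDs passed in are placeholders, not assets from this post.
BASE = "https://res.cloudinary.com/{cloud}/image/upload/{transformation}/{public_id}"


def gen_remove_url(cloud: str, public_id: str, prompt: str) -> str:
    """Build a URL asking Generative Remove to erase whatever `prompt` describes."""
    # Assumption: e_gen_remove takes the object description as prompt_<text>.
    # The prompt is percent-encoded so spaces survive inside the URL path.
    transformation = f"e_gen_remove:prompt_{quote(prompt, safe='')}"
    return BASE.format(cloud=cloud, transformation=transformation, public_id=public_id)


def gen_fill_url(cloud: str, public_id: str, aspect_ratio: str, width: int) -> str:
    """Build a URL that pads to a new aspect ratio and fills the padding generatively."""
    # Assumption: b_gen_fill acts as the background of a pad crop (c_pad).
    transformation = f"ar_{aspect_ratio},b_gen_fill,c_pad,w_{width}"
    return BASE.format(cloud=cloud, transformation=transformation, public_id=public_id)


if __name__ == "__main__":
    print(gen_remove_url("demo", "model.jpg", "logo on the hat"))
    print(gen_fill_url("demo", "model.jpg", "16:9", 1200))
```

Because the edit lives entirely in the URL, the same two helpers can be applied to every asset in a catalog in a loop, which is where the "at scale" part comes from.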

To learn more about these transformations, check out our YouTube video.

Adding accurate and contextually relevant captions or alt tags to images is critical for optimizing search engine visibility and for ensuring compliance with web accessibility standards. Manual captioning is time-consuming and resource-intensive. Traditional AI-based image tagging can also be highly error-prone, as the underlying models are mostly trained on limited data.

Powered by multimodal large language models (LLMs), Cloudinary’s Image Captioning feature goes beyond traditional solutions to provide contextually relevant captions that accurately describe an image. Multimodal LLMs are trained on massive datasets that contain both images and text, which lets them produce impressive results.

This feature is part of Cloudinary’s Content Analysis add-on and is available as part of the Programmable Media product. It can be used while uploading assets to Cloudinary and benefits teams by:

  • Increasing SEO ranking by ensuring accurate product descriptions. 
  • Meeting accessibility requirements by adding accurate image tags. 
  • Improving discoverability for internal teams by making assets easily searchable with contextually relevant tags.

AI Image Captioning example: 

A handbag with a purse, perfume, and other items.
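
To make the upload-time flow a little more concrete, here is a small sketch of how an application might pull the generated caption out of an upload response and reuse it as alt text. Everything about the shape is an assumption on our part rather than something stated in this post: the `detection: "captioning"` upload option and the `info -> detection -> captioning -> data -> caption` nesting reflect our reading of the add-on documentation, and `sample_response` is a hand-written, hypothetical fragment that reuses the caption from the example above.

```python
from typing import Any, Dict, Optional

# Assumption: captioning is requested at upload time through a detection option.
UPLOAD_OPTIONS: Dict[str, str] = {"detection": "captioning"}

# Assumption: where the caption sits inside the upload response.
CAPTION_PATH = ("info", "detection", "captioning", "data", "caption")


def extract_caption(upload_response: Dict[str, Any]) -> Optional[str]:
    """Return the generated caption, or None if the response does not carry one."""
    node: Any = upload_response
    for key in CAPTION_PATH:
        if not isinstance(node, dict) or key not in node:
            return None  # add-on not enabled, analysis pending, or a different shape
        node = node[key]
    return node if isinstance(node, str) else None


# Hypothetical response fragment, not real API output.
sample_response = {
    "public_id": "handbag",
    "info": {
        "detection": {
            "captioning": {
                "data": {"caption": "A handbag with a purse, perfume, and other items."}
            }
        }
    },
}

if __name__ == "__main__":
    alt_text = extract_caption(sample_response) or ""
    print(f'<img src="handbag.jpg" alt="{alt_text}">')
```

With the Python SDK, the options would presumably be passed as `cloudinary.uploader.upload(path, **UPLOAD_OPTIONS)`. Returning `None` instead of raising keeps pages rendering when a caption is missing.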

For more about this feature, check out our YouTube video.

Developers often need to enable search experiences on websites or front-end applications that list specific images, such as images of specific shoe models on a category page or image galleries of the menu items on a food delivery site. Cloudinary offers a robust Search API that uses query expressions to filter and retrieve assets in the product environment at a granular level.

The incorporation of cacheable search URLs in Cloudinary’s suite is a significant step towards enhancing search performance, mirroring advancements seen in other AI-powered photo editing tools that emphasize efficient and user-friendly access to AI functionalities.

To streamline use of this API for client-side search applications, you can now generate cacheable search URLs that can be easily embedded in any front-end application. The URLs can be configured to cache the results on the CDN for a specific amount of time, after which the search results are regenerated. This improves search performance and saves developers the time they would otherwise spend building a caching mechanism for client-side search.
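
For a sense of what is being taken off developers' plates, here is a minimal sketch of the kind of time-to-live (TTL) cache a team might otherwise hand-roll around a search call. It is not Cloudinary's implementation and does not call the Search API: `TTLSearchCache` and the `search` callable are hypothetical names, and the cache lives in process memory rather than on a CDN. It only illustrates the behavior described above, where results are served from cache for a fixed period and then regenerated.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLSearchCache:
    """Cache search results per query expression for a fixed number of seconds."""

    def __init__(
        self,
        search: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._search = search      # hypothetical function that runs the real search
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, expression: str) -> Any:
        now = self._clock()
        entry = self._entries.get(expression)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: serve the cached result
        result = self._search(expression)  # missing or expired: regenerate
        self._entries[expression] = (now, result)
        return result


if __name__ == "__main__":
    def slow_search(expression: str) -> Dict[str, str]:
        print(f"running search for {expression!r}")
        return {"expression": expression}

    cache = TTLSearchCache(slow_search, ttl_seconds=300)
    cache.get("tags=shoes")  # runs the search
    cache.get("tags=shoes")  # served from the cache
```

Even this small version leaves open questions such as eviction, sharing the cache across users, and invalidation, which is part of why pushing the caching to the CDN through a URL is attractive.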

For more about this feature, check out our YouTube video.

That wraps up our key innovations for this month. Stay tuned for more updates! 

If you’re new to Cloudinary and want to learn more, sign up for a free account.
