Cloudinary AI Content Analysis
Last updated: May-28-2026
Cloudinary is a cloud-based service that provides solutions for image and video management. These include server or client-side upload, on-the-fly image and video transformations, fast CDN delivery, and a variety of asset management options.
The Cloudinary AI Content Analysis add-on uses AI-based object detection and content-aware algorithms to provide the following functionality:
- Object-aware cropping: Ensures that your image crops keep the specific objects that matter to you, even when you significantly modify the aspect ratio.
  Important: By default, delivery URLs that use this add-on either need to be signed or eagerly generated. You can optionally remove this requirement by selecting this add-on in the Allow unsigned add-on transformations section of the Security page in the Console Settings. (For simplicity, most of the examples on this page show eagerly generated URLs without signatures.)
- Automatic image tagging: Adds tags to your images based on objects or abstract concepts detected by the content-aware detection models specified on upload, or when invoked on images already stored in your product environment.
- Image quality analysis: Analyzes the quality of an image. The response includes the iqa-analysis field.
- Watermark detection: Detects banners and watermarks in images.
- AI-based image captioning: Analyzes an image and suggests a caption appropriate to the image's contents.
Getting started
Before you can use the Cloudinary AI Content Analysis add-on:
- You must have a Cloudinary account. If you don't already have one, you can sign up for a free account.
- Register for the add-on: make sure you're logged in to your account and then go to the Add-ons page. For more information about add-on registrations, see Registering for add-ons.
Keep in mind that many of the examples on this page use our SDKs. For SDK installation and configuration details, see the relevant SDK guide.
If you're new to Cloudinary, you may want to take a look at the Developer Kickstart for a hands-on, step-by-step introduction to a variety of features.
Supported content-aware detection models
The Cloudinary AI Content Analysis add-on supports a number of built-in content-aware detection models, each supporting a specific set of categories and objects. You can specify which version of each model to invoke for each use of the add-on.
Cloudinary currently supports the following models:
| Model | Description |
|---|---|
| coco | The Common Objects in Context model contains just 80 common objects. |
| cld-fashion | Cloudinary's fashion model is specifically dedicated to items of clothing. Used with automatic image tagging, the response includes attributes of the clothing identified, for example whether the garment contains pockets, its material and the fastenings used. |
| lvis | The Large Vocabulary Instance Segmentation model contains thousands of general objects. |
| unidet | The UniDet model is a unified model, combining a number of object models, including Objects365, which focuses on diverse objects in the wild. |
| human-anatomy | Cloudinary's human anatomy model identifies parts of the human body in an image. It works best when the majority of a human body is detected in the image. |
| cld-text | Cloudinary's text model tells you if your image includes text, and where it's located. Used with automatic image tagging, you can then search for images that contain blocks of text. Used with object-aware cropping, you can choose to keep only the text part, or specify a crop that avoids the text. |
| shop-classifier | Cloudinary's shop classifier model detects if the image is a product image taken in a studio, or if it's a natural image. |
| image-type | Cloudinary's image type model detects generic properties about a photographic image, for example, photographic style, setting and time of the photo. |
| captioning | Cloudinary's captioning model is used to describe the contents of an image. See AI-based image captioning. |
| watermark-detection | Cloudinary's watermark detection model identifies if the image contains different types of watermark. See Watermark detection. |
| iqa | The Image Quality Analysis (IQA) model can predict the quality of a given image on a scale from 0 to 1 and provides a general quality estimation, categorized as 'low', 'medium', or 'high'. See Image quality analysis. |
Model capabilities
This table shows the capabilities of each supported version of each model:
- Default version is the version of the model that is invoked if left unspecified.
- Version indicates support for a particular version of the model - different versions have different accuracies.
- Default confidence shows the confidence level used when auto_tagging is set to default.
- Tag indicates support for returning tags. This is a required capability for automatic image tagging.
- Confidence indicates support for returning confidence levels.
- Bounding Box indicates support for returning bounding boxes. This is a required capability for object-aware cropping.
- Attributes indicates support for returning attributes for each tag in a (key,value) list.
- If you are using our Asia Pacific data center, currently you can use only the COCO and Open Images models.
- If you have difficulty accessing any of the models, please contact support.
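To make the version and confidence options above concrete, here is a minimal sketch of how upload options for automatic image tagging might be assembled. It assumes the upload call accepts a `detection` option naming the model (optionally with a version suffix such as `coco_v2`) and an `auto_tagging` option that is either the string `default` or a confidence threshold between 0 and 1. The helper `build_detection_options` is hypothetical and not part of any SDK, so check the option names against the SDK reference before relying on them.

```python
from typing import Optional, Union


def build_detection_options(
    model: str,
    version: Optional[int] = None,
    auto_tagging: Union[float, str, None] = None,
) -> dict:
    """Assemble keyword options for an upload that requests detection.

    Hypothetical helper: the option names `detection` and `auto_tagging`
    are assumptions about the upload API, not confirmed by this page.
    """
    if not model:
        raise ValueError("model is required")

    # Model only (e.g. "coco") leaves the choice to the default version;
    # model plus version (e.g. "coco_v2") pins a specific version.
    detection = model if version is None else f"{model}_v{version}"
    options = {"detection": detection}

    if auto_tagging is not None:
        if isinstance(auto_tagging, str):
            # "default" asks for the model's default confidence level.
            if auto_tagging != "default":
                raise ValueError("auto_tagging string must be 'default'")
        elif not 0.0 <= auto_tagging <= 1.0:
            # Assumed range for an explicit confidence threshold.
            raise ValueError("auto_tagging threshold must be between 0 and 1")
        options["auto_tagging"] = auto_tagging

    return options
```

The resulting dictionary could then be passed as keyword arguments to an SDK upload call, for example `upload(file, **build_detection_options("coco", 2, "default"))`, assuming the SDK accepts these options.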
Supported objects and categories
- The Full URL Syntax column shows the syntax to use to detect a specific object or category in a particular version of a model (e.g. coco_v2_tie). You can also omit the version (e.g. coco_tie), or both the model and version (e.g. tie).
- You can specify the model and version (e.g. coco_v2), or only the model (e.g. coco).
- Specify the object from the cld-fashion model (e.g. g_track_person:obj_hat).
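The naming pattern in the notes above (model, then version, then object, each part optional from the left) can be sketched as a small string builder. The function `object_qualifier` is a hypothetical helper for illustration only; it simply reproduces the three forms quoted above and does not check whether a model actually supports the object.

```python
from typing import Optional


def object_qualifier(
    obj: str,
    model: Optional[str] = None,
    version: Optional[int] = None,
) -> str:
    """Join model, version and object into one qualifier string.

    Hypothetical helper: mirrors the forms `coco_v2_tie`, `coco_tie`
    and `tie` from the notes above.
    """
    if not obj:
        raise ValueError("obj is required")
    if version is not None and model is None:
        # A version only has meaning relative to a model.
        raise ValueError("a version requires a model")

    parts = []
    if model is not None:
        parts.append(model)
        if version is not None:
            parts.append(f"v{version}")
    parts.append(obj)
    return "_".join(parts)


if __name__ == "__main__":
    print(object_qualifier("tie", "coco", 2))  # model, version and object
    print(object_qualifier("tie", "coco"))     # version omitted
    print(object_qualifier("tie"))             # model and version omitted
```

Omitting the version leaves the choice to the model's default version, and omitting the model as well leaves the choice of model to the service.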
Private models
If you have your own content-aware detection models that you would like to use, these can be integrated as private models that work only on your product environment. This service is provided for customers on Enterprise plans through Professional Services. Contact our Enterprise support and sales team or your CSM to find out more.
Object detection demo
This demo lets you choose one of the content-aware detection models, and shows up to twenty objects that are detected by that model in an image of your choice.
Automatic image tagging is requested on upload, and the response provides the necessary information to overlay bounding boxes around the detected objects, together with the confidence level.
- Read this blog to discover all the Cloudinary features in this demo.
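As a rough illustration of what the demo does with the tagging response, the sketch below flattens a set of detected tags, keeps at most twenty of them, and produces a label for each bounding box. The input shape is an assumption (a mapping from tag name to a list of entries with `confidence` and `bounding-box` keys), as is the choice to keep the highest-confidence detections. The names `Detection`, `top_detections` and `box_label` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Sequence


class Detection(NamedTuple):
    tag: str
    confidence: float
    box: Sequence[float]  # assumed layout: [x, y, width, height]


def top_detections(tags: Dict[str, List[dict]], limit: int = 20) -> List[Detection]:
    """Flatten the per-tag entries and keep the `limit` most confident ones.

    Assumes each entry has "confidence" and "bounding-box" keys; adjust
    the key names to match the actual response.
    """
    flat = [
        Detection(tag, item["confidence"], item["bounding-box"])
        for tag, items in tags.items()
        for item in items
    ]
    # Highest confidence first, so the cut-off drops the weakest detections.
    flat.sort(key=lambda d: d.confidence, reverse=True)
    return flat[:limit]


def box_label(detection: Detection) -> str:
    """Text to draw next to a bounding box, e.g. the tag and its confidence."""
    return f"{detection.tag} ({detection.confidence:.0%})"
```

Each returned `Detection` carries the box coordinates needed to draw an overlay, and `box_label` gives the caption to display beside it.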




