Cloudinary Blog

New Google-powered add-on for automatic video categorization and tagging

Introducing Automatic Video Tagging and Content Categorization

Thanks to the growth of the web and improvements in network bandwidth, video is now a major source of information and entertainment shared over the internet. As a developer or asset manager, making corporate videos available for viewing, not to mention user-uploaded videos, means you also need a way to categorize them by content and make your video library searchable. Most systems end up organizing their videos by metadata such as the filename, or by user-generated tags (e.g., YouTube). This sort of indexing is subjective, inconsistent, time-consuming, incomplete, and superficial.

A well-organized indexing system lets you easily manage and organize your media libraries:

  • Enable personnel across your entire organization to find resources they may need
  • Increase engagement by helping your users find exactly what they’re looking for
  • Connect users who share common interests and help them find other content that would interest them
  • Increase sales or advertising revenue by determining the main subjects that interest particular users and integrating this information with your existing analytics/personalization tools to display relevant product recommendations or adverts

But ultimately, any sort of manual video categorization process would take huge amounts of time and resources.

Introducing Cloudinary's Automatic Video Tagging add-on, powered by Google Cloud Video Intelligence, which is now fully integrated into Cloudinary's video management and delivery pipeline. State-of-the-art machine learning allows for the recognition of various visual objects and concepts in videos, simplifying and automating the categorization and tagging process.

Using the Automatic Video Tagging add-on

Take a look at the following video of horses:

Using the add-on, automatically assigning resource tags to a video is as simple as adding two parameters when either uploading a new video or updating an existing one: set the categorization parameter to google_video_tagging, and set the auto_tagging parameter to the minimum confidence score that a detected category must reach before it is added as a tag. For example, the following uploads the horses video and requests automatic tagging for all categories with a confidence score above 40%:

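Ruby:
  Cloudinary::Uploader.upload("horses.mp4", :resource_type => :video, :categorization => "google_video_tagging", :auto_tagging => 0.4)

PHP:
  \Cloudinary\Uploader::upload("horses.mp4", array("resource_type" => "video",
    "categorization" => "google_video_tagging", "auto_tagging" => 0.4));

Python:
  cloudinary.uploader.upload("horses.mp4", resource_type = "video", categorization = "google_video_tagging", auto_tagging = 0.4)

Node.js:
  cloudinary.uploader.upload("horses.mp4",
    function(result) { console.log(result); },
    { resource_type: "video", categorization: "google_video_tagging", auto_tagging: 0.4 });

Java:
  cloudinary.uploader().upload("horses.mp4", ObjectUtils.asMap("resource_type", "video",
    "categorization", "google_video_tagging", "auto_tagging", "0.4"));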

Once the categorization process completes, the information is returned to Cloudinary and all categories that exceed your specified confidence score are automatically added as tags on your video.
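As a rough sketch of the two parameters described above, the Python helper below builds the upload options and rejects thresholds outside the 0 to 1 range (a common slip is passing 40 instead of 0.4). The names tagging_options and upload_with_tagging are made up for this illustration, and the actual upload assumes the cloudinary Python package is installed and configured with your account credentials.

```python
def tagging_options(min_confidence=0.4):
    """Build upload options that request Google-powered video tagging.

    min_confidence is a fraction between 0.0 and 1.0 (0.4 means 40%).
    """
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("auto_tagging must be between 0.0 and 1.0")
    return {
        "resource_type": "video",
        "categorization": "google_video_tagging",
        "auto_tagging": min_confidence,
    }


def upload_with_tagging(path, min_confidence=0.4):
    """Hypothetical wrapper: upload a video and request automatic tagging."""
    # Imported lazily so tagging_options() can be used without the SDK installed.
    import cloudinary.uploader

    return cloudinary.uploader.upload(path, **tagging_options(min_confidence))
```

Calling upload_with_tagging("horses.mp4") would then be equivalent to the Python upload call shown earlier, with the threshold checked before any request is sent.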


Below is a snippet of the upload response for the horse video:

"tags": ["animal", "freezing", "frost", "horse", … ],
"info": {
   "google_video_tagging": {
      "status": "complete",
      "data": [
         {"tag": "horse",
          "start_time_offset": 0.0,
          "end_time_offset": 12.6364,
          "confidence": 0.8906},
         {"tag": "horse",
          "start_time_offset": -1,
          "end_time_offset": -1,
          "confidence": 0.8906},
         {"tag": "animal",
          "start_time_offset": 0.0,
          "end_time_offset": 13.47364,
          "confidence": 0.8906},
         …


The benefits of video tagging

As can be seen in the example snippet above, various categories were automatically detected in the uploaded video and automatically added as tags. Each category is listed together with the start and end times of the relevant video segment (an offset time of -1 means the category represents the entire video) and the confidence score of the detected category, where 1.0 means 100% confidence.
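To make the structure of that data concrete, here is a minimal Python sketch that separates whole-video categories (both offsets equal to -1) from categories tied to a specific segment. The function name split_categories is hypothetical, and the sample list simply copies the three entries visible in the snippet above.

```python
def split_categories(data, min_confidence=0.0):
    """Split detected categories into whole-video tags and timed segments."""
    whole_video = {}  # tag -> confidence for categories covering the entire video
    segments = []     # (tag, start, end) for categories tied to part of the video
    for item in data:
        if item["confidence"] < min_confidence:
            continue
        if item["start_time_offset"] == -1 and item["end_time_offset"] == -1:
            whole_video[item["tag"]] = item["confidence"]
        else:
            segments.append(
                (item["tag"], item["start_time_offset"], item["end_time_offset"])
            )
    return whole_video, segments


# The three entries shown in the response snippet.
sample = [
    {"tag": "horse", "start_time_offset": 0.0, "end_time_offset": 12.6364, "confidence": 0.8906},
    {"tag": "horse", "start_time_offset": -1, "end_time_offset": -1, "confidence": 0.8906},
    {"tag": "animal", "start_time_offset": 0.0, "end_time_offset": 13.47364, "confidence": 0.8906},
]

whole_video, segments = split_categories(sample, min_confidence=0.4)
print(whole_video)  # {'horse': 0.8906}
print(segments)     # [('horse', 0.0, 12.6364), ('animal', 0.0, 13.47364)]
```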

Once the video has been categorized, that information can be shared with your analytics tools. Cross-examining both the categorization and usage data can yield valuable insights into how different videos impact engagement and conversion. Do the videos show indoor or outdoor scenes? Do they include people? Animals? This information can then be leveraged for A/B testing and user profiling.

For example, you can test how different videos (e.g., with or without animals) may impact engagement for a specific product or service, helping you use the optimal content when designing websites, apps, or email campaigns. You may determine that a user watching videos of parties, events, sports, and music is probably a college student or young adult, whereas a user who uploads videos of parks, children, and playgrounds is more likely to be a parent. This knowledge can help you focus your content on the right audience and increase engagement and conversion.
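The with-or-without-animals comparison could be sketched along these lines in Python. Everything here is illustrative: the watch_rate field stands in for whatever engagement metric your analytics tool exports, and the three records and their numbers are invented for the example, not measured results.

```python
from statistics import mean


def engagement_by_tag(videos, tag):
    """Return the average engagement of videos with and without a given tag."""
    with_tag = [v["watch_rate"] for v in videos if tag in v["tags"]]
    without_tag = [v["watch_rate"] for v in videos if tag not in v["tags"]]
    return (
        mean(with_tag) if with_tag else None,
        mean(without_tag) if without_tag else None,
    )


# Hypothetical records: auto-generated tags joined with an analytics metric.
videos = [
    {"public_id": "horses", "tags": ["animal", "horse"], "watch_rate": 0.75},
    {"public_id": "puppies", "tags": ["animal", "dog"], "watch_rate": 0.5},
    {"public_id": "office_tour", "tags": ["building"], "watch_rate": 0.25},
]

with_animals, without_animals = engagement_by_tag(videos, "animal")
print(with_animals, without_animals)  # 0.625 0.25
```

A real comparison would need far more videos per group and a significance test before drawing conclusions; the sketch only shows how the tags make the grouping step trivial.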

Additionally, a well-indexed, organized library of videos can be leveraged across your entire organization. Tagging is particularly useful if your company has a constantly growing library of digital assets that needs to be made available to various teams within your organization. For example, if the marketing team needs a video of a dog for an email campaign, they can search for and select the most appropriate video.
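One way to picture this kind of lookup is a small in-memory index from tag to video, built from the tags and public_id fields of each resource. This is only a sketch under assumptions: the helper names and the two sample records are hypothetical, and a production library would more likely query a search service than a Python dictionary.

```python
from collections import defaultdict


def build_tag_index(resources):
    """Map each lowercase tag to the set of public IDs that carry it."""
    index = defaultdict(set)
    for resource in resources:
        for tag in resource.get("tags", []):
            index[tag.lower()].add(resource["public_id"])
    return index


def find_videos(index, *tags):
    """Return the sorted public IDs of videos carrying every requested tag."""
    matches = [index.get(tag.lower(), set()) for tag in tags]
    return sorted(set.intersection(*matches)) if matches else []


# Hypothetical library entries with automatically assigned tags.
library = [
    {"public_id": "dog_park", "tags": ["animal", "dog", "park"]},
    {"public_id": "horses", "tags": ["animal", "horse"]},
]

index = build_tag_index(library)
print(find_videos(index, "dog"))     # ['dog_park']
print(find_videos(index, "animal"))  # ['dog_park', 'horses']
```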

See automatic tagging in action

Visit Cloudinary's Video Transcoding demo, where you can check out the results of the automatic tagging add-on for a number of sample videos or even upload your own. You can also see examples of a variety of advanced video transformations as well as a demonstration of the Video Transcription add-on.


The Google-powered Automatic Video Tagging add-on provides you with meaningful data extracted from videos. Take advantage of that data to make strategic business decisions that could improve your users’ experience and drive greater profits. Cloudinary’s service, together with the fully integrated Automatic Video Tagging add-on, provides you with the powerful ability to streamline your content management as well as increase your users’ engagement and conversion.

The add-on is available with all Cloudinary plans and offers a free add-on tier for you to try out. If you don't have a Cloudinary account yet, sign up for a free account.
