Video content is exploding across search, social media, and AI-generated answers, yet few creators optimize it for discoverability or accessibility.
When people think of SEO, the first things that come to mind are usually text-based: keywords, backlinks, page speed. But search has changed. AI-powered answers, video-first social feeds, and platforms like LinkedIn all prioritize video.
To stand out, your videos need to be structured and accessible with captions, transcripts, metadata, and tags that machines can actually read. These not only give Google more to crawl, but they also help you reach wider audiences, boost engagement, and signal that you’re creating inclusive content.
This guide walks you through how to make video assets SEO-ready and accessible using Cloudinary.
Uploading videos is straightforward; making them searchable, discoverable, and accessible to search engines is where most teams fall short.
Cloudinary can automate much of this work, for example by adding relevant tags to your video with the Google Video Tagging add-on. You can also attach alt text and captions at upload time.
For this post, we’ll use the Cloudinary SDK to handle uploads. You can download it using the command below:
npm install cloudinary
You’ll need a Cloudinary account to access your credentials, which allow you to interact with the Cloudinary SDK. If you don’t have an account yet, sign up at Cloudinary.
After signing in, go to your Cloudinary dashboard and copy the following details: Cloud Name, API Key, and API Secret.

You can store these details in environment variables and configure Cloudinary at the top of your script like this:
import { v2 as cloudinary } from 'cloudinary';

cloudinary.config({
  cloud_name: process.env.CLOUDINARY_CLOUD_NAME,
  api_key: process.env.CLOUDINARY_API_KEY,
  api_secret: process.env.CLOUDINARY_API_SECRET,
});
With all of this set up, let’s now see how to upload a video and add both contextual metadata and categorization tags.
To start, use the cloudinary instance you configured earlier to upload a video, passing metadata through the context field. This is where you can set fields such as alt text and a caption.
cloudinary.uploader.upload('video.mp4', {
  resource_type: 'video',
  context: 'caption=This is a test video|alt=Test Video'
})
The example above adds the caption and alt text, separated by the | character, all in one string. Cloudinary will split this up and store it like this:
"context": {
"custom": {
"alt": "Test Video",
"caption": "This is a test video"
}
},
You can also auto-generate tags using the Google Video Tagging add-on. To use it, search for Google Video Tagging on the Add-ons page in your Cloudinary console:

Click the add-on and subscribe to the free plan:

Now, attach categorization: 'google_video_tagging' to your upload call, as shown in the sketch below.
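Here’s a minimal sketch of the upload with auto-tagging enabled; the auto_tagging option is optional and also applies any returned tag above the given confidence threshold to the asset:
cloudinary.uploader.upload('video.mp4', {
  resource_type: 'video',
  categorization: 'google_video_tagging',
  auto_tagging: 0.7, // optional: apply tags returned with confidence >= 0.7
  context: 'caption=This is a test video|alt=Test Video'
})
Cloudinary will then generate tags like this in the upload response: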
"info": {
"categorization": {
"google_video_tagging": {
"data": [
{ "tag": "solar power", "confidence": 0.98 },
{ "tag": "cityscape", "confidence": 0.91 }
]
}
}
},
These tags are useful for organizing your video library, enabling search filters, and improving semantic SEO.
Once your video is uploaded, Cloudinary stores all of this metadata alongside the asset. You (or your dev) can then retrieve it by calling Cloudinary’s Admin API:
GET https://api.cloudinary.com/v1_1/<cloud_name>/resources/video/upload/<public_id>
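If you’d rather stay in Node, the SDK wraps the same Admin API endpoint. A minimal sketch, assuming the video’s public ID is solar_video_demo:
const details = await cloudinary.api.resource('solar_video_demo', {
  resource_type: 'video',
});

console.log(details.context); // alt text and caption
console.log(details.info);    // Google video tags, if categorization ran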
And you’ll get a detailed response that looks like this:
{
  "secure_url": "https://res.cloudinary.com/.../video.mp4",
  "context": {
    "custom": {
      "alt": "Test Video",
      "caption": "This is a test video"
    }
  },
  "info": {
    "categorization": {
      "google_video_tagging": {
        "data": [
          { "tag": "solar power", "confidence": 0.98 },
          { "tag": "cityscape", "confidence": 0.91 }
        ]
      }
    }
  },
  "duration": 196,
  "width": 1280,
  "height": 720
}
Now you’ve got everything you need to describe the video in a way search engines understand. The next step is to wrap it in structured data (JSON-LD).
Structured data is a special format that gives search engines extra context about what’s on your page. For video content specifically, there’s a dedicated format called VideoObject.
This schema markup is written in JSON-LD (JavaScript Object Notation for Linked Data). It sits inside a <script type="application/ld+json"> block in your HTML and gives search engines structured data they can read without rendering the page. Using the values pulled from Cloudinary, it looks like this:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "VideoObject",
  "name": "This is a test video",
  "description": "Test Video",
  "thumbnailUrl": "https://res.cloudinary.com/your_cloud/video/upload/thumbnail.jpg",
  "uploadDate": "2025-06-12",
  "contentUrl": "https://res.cloudinary.com/your_cloud/video/upload/video.mp4",
  "duration": "PT3M16S",
  "keywords": ["solar power", "cityscape"]
}
</script>
Let’s break it down:
- @type: VideoObject tells Google, “Hey, this is a video.”
- name and description are pulled from your caption and alt text.
- contentUrl is the direct link to the video file.
- keywords are the auto-generated tags from Google’s video AI.
- duration is how long the video is, formatted as an ISO 8601 duration (PT3M16S = 3 min, 16 sec).
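If you’re generating this markup from the Admin API response, the only field that needs massaging is duration, which the API returns in seconds. A small sketch of the conversion (assuming videos under an hour):
// Convert Cloudinary's duration in seconds (e.g., 196) to an ISO 8601
// duration string (e.g., "PT3M16S") for the VideoObject schema.
function toIsoDuration(seconds) {
  const mins = Math.floor(seconds / 60);
  const secs = Math.round(seconds % 60);
  return `PT${mins}M${secs}S`;
}

console.log(toIsoDuration(196)); // "PT3M16S"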
You can inject this block into the page using your CMS, a frontend framework like Next.js, or just by pasting it into the HTML manually.
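In a React or Next.js app, for example, a rough sketch might look like this (the component and its video prop are hypothetical, built from the Admin API response shown earlier):
// Renders a VideoObject JSON-LD block from Cloudinary Admin API data.
export default function VideoSchema({ video }) {
  const jsonLd = {
    '@context': 'https://schema.org',
    '@type': 'VideoObject',
    name: video.context.custom.caption,
    description: video.context.custom.alt,
    contentUrl: video.secure_url,
    uploadDate: video.created_at, // Cloudinary returns an ISO timestamp
    duration: toIsoDuration(video.duration), // helper from the sketch above
    keywords: video.info.categorization.google_video_tagging.data.map((t) => t.tag),
  };

  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
    />
  );
}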
Beyond metadata, captions and transcripts matter just as much. They make your video accessible to people watching on mute, users with hearing impairments, and search engines or AI tools trying to parse your content.
You may wonder what the difference is between captions and transcripts:
- Captions are time-synced text shown during playback.
- Transcripts are the full, unsynced text of the audio.
Transcripts are especially great for SEO, skimming, and summarization.
If you don’t already have a caption file, Cloudinary can generate one using the Google AI Video Transcription add-on. Search for it on the Add-ons page and subscribe to the free plan:

You can now modify your upload logic to request a subtitle file from Google’s Speech-to-Text API using the raw_convert parameter:
cloudinary.uploader.upload('video.mp4', {
  resource_type: 'video',
  raw_convert: 'google_speech:vtt',
  context: 'caption=This is a test video|alt=Test Video'
})
This tells Cloudinary to send the audio to Google Speech-to-Text and return a .vtt subtitle file you can use with any video player.
Once the .vtt file is ready, include it in your <video> element like so:
<video controls crossorigin="anonymous" class="w-full max-w-md border-4 rounded">
  <source src="https://res.cloudinary.com/your_cloud/video/upload/solar_video_demo.mp4" type="video/mp4" />
  <track
    label="English"
    kind="subtitles"
    srclang="en"
    src="https://res.cloudinary.com/your_cloud/raw/upload/solar_video_demo.vtt"
    default
  />
</video>
This gives you a video that plays normally, with toggleable subtitles for accessibility and UX, and better SEO, because now your video has a discoverable text layer.
To boost your SEO even more, don’t stop at captions. Drop the transcript right below your video as readable HTML:
<video controls crossorigin="anonymous" class="w-full max-w-md border-4 rounded">
  <source src="https://res.cloudinary.com/your_cloud/video/upload/solar_video_demo.mp4" type="video/mp4" />
  <track label="English" kind="subtitles" srclang="en"
    src="https://res.cloudinary.com/your_cloud/raw/upload/solar_video_demo.vtt" default />
</video>

<section aria-label="Video transcript">
  <h2>Transcript</h2>
  <p><!-- Full transcript text from the .vtt file goes here --></p>
</section>
This helps Google index the content like blog text and lets users skim instead of watching the full video. If you used the google_speech add-on, you already have the text; just convert the .vtt file to plain text or render it in its original format.
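If you want plain text rather than the raw cue format, here’s a rough sketch that fetches the generated .vtt file and strips the WebVTT cue timings (the file URL is the same placeholder used above):
// Fetch the generated .vtt file and strip timings and cue identifiers,
// leaving plain transcript text you can render below the video.
async function loadTranscript(vttUrl) {
  const res = await fetch(vttUrl);
  const vtt = await res.text();

  return vtt
    .split('\n')
    .map((line) => line.trim())
    .filter((line) =>
      line &&
      line !== 'WEBVTT' &&
      !line.includes('-->') &&  // drop cue timing lines
      !/^\d+$/.test(line)       // drop numeric cue identifiers
    )
    .join(' ');
}

loadTranscript('https://res.cloudinary.com/your_cloud/raw/upload/solar_video_demo.vtt')
  .then((text) => console.log(text));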
Once all of this is in place, you want to make sure your setup actually meets SEO and accessibility standards.
Start with Google’s Rich Results Test tool. It checks whether your structured data (like VideoObject) is valid and eligible for enhanced search features.
To use it:
- Go to Google’s Rich Results Test.
- Paste the URL of your video page.
- Click the Test URL button.
If the video markup is valid, you’ll see a Videos section among the results. If anything is missing, like no contentUrl, missing duration, or invalid @context, the tool will point it out.

You can also use Lighthouse in Chrome DevTools to test for accessibility issues that affect real users (and sometimes SEO too).
To use it:
- Open your video page in Chrome.
- Right-click and choose Inspect.
- Click the Lighthouse tab.
- Click the Analyze page load button.

Lighthouse will scan everything. Your goal is to get the Accessibility score as close to 100 as possible. Here’s an example of what the results might look like:

You’ll see a breakdown of passed checks and any accessibility issues detected, like missing labels, insufficient contrast, or unsupported video elements. Fixing these not only improves SEO but also creates a better experience for everyone.
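If you’d rather script this check, Lighthouse also ships as a CLI you can run with npx (the URL below is a placeholder):
npx lighthouse https://example.com/video-page --only-categories=accessibility --view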
So far, we’ve covered the basics like adding captions, transcripts, structured data, and metadata. But what happens when you’re managing hundreds of videos?
You need a repeatable system. That means properly tagging every video uploaded to Cloudinary, storing metadata at upload, and automating as much of this as possible.
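In code, that can start with a single upload call that bundles everything covered so far; a sketch, reusing the placeholder values from earlier:
cloudinary.uploader.upload('video.mp4', {
  resource_type: 'video',
  context: 'caption=This is a test video|alt=Test Video', // alt text + caption
  categorization: 'google_video_tagging',                 // auto-tagging
  auto_tagging: 0.7,
  raw_convert: 'google_speech:vtt'                        // transcript/captions
})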
For a more hands-off approach, you can use Cloudinary MediaFlows, which ensures that every uploaded video is automatically processed and enriched without manual intervention.
Cloudinary’s MediaFlows lets you automate image and video processing by creating workflows that handle these tasks behind the scenes. It comes in two flavors:
- EasyFlows. A simple, no-code automation tool for predefined actions.
- PowerFlows. A more advanced, flexible builder that allows for custom logic.
To access MediaFlows, go to your Cloudinary console and click the MediaFlows logo as shown below:

The MediaFlows dashboard will open, where you can create workflows. Since you need something that dynamically triggers video processing on upload, choose PowerFlows.
Click the + Create New button, select PowerFlow, and the UI will open, where you can drag and configure blocks for your workflow.
On the left-hand side, you’ll see a list of blocks for actions, add-ons, APIs, and more. Search through this list to configure your flow visually.

For this guide, you’ll use three blocks:
- Cloudinary Upload
- Update Contextual Metadata
- Google Video Tagging Add-on
Drag the Cloudinary Upload block to the workspace. This block will listen for new uploads.
To make the workflow react to uploads, you need to set up a webhook. Click the Set Webhook button and enable it. Once done, any time a video is uploaded, the webhook will trigger this workflow.

Next, drag the Update Contextual Metadata block into the workspace and connect it to the Cloudinary Upload block.
Click the Update Contextual Metadata block and set custom metadata fields like caption or alt. These fields will be added to the video as soon as it’s uploaded.

Now, scroll to the Add-ons section on the left. Drag the Google Video Tagging block to the canvas and connect it to the Update Contextual Metadata block.

Once connected, the workflow will auto-tag every video using Google AI tagging as part of the upload process.
After setting up your blocks, click Save to store the workflow.
Now let’s test it. Click the Settings icon beside the Test button to configure which sample asset to upload. Use a video URL as your test asset.

Once the test video is set, click Test. This will trigger the full workflow using that video.

If everything’s working, you’ll see success messages like the one above. If you head to your Cloudinary Media Library, you’ll find the video uploaded, tagged automatically, and captioned with the metadata you specified.

Making your videos SEO-friendly and accessible isn’t just good practice; it’s essential if you want them to be seen, understood, and ranked. With Cloudinary, you can automate transcripts, captions, metadata, and tagging, then scale that across hundreds of assets using MediaFlows.
Sign up on Cloudinary to start structuring your video assets for accessibility and discoverability at scale.