Microsoft Azure Video Indexer

Cloudinary is a cloud-based service that provides an end-to-end image and video management solution including uploads, storage, manipulations, optimizations and delivery. Cloudinary's video solution includes a rich set of video manipulation capabilities, including cropping, overlays, optimizations, and a large variety of special effects.

The Microsoft Azure Video Indexer add-on integrates Microsoft Azure's automatic video indexing capabilities with Cloudinary's complete video management and manipulation pipeline.

Cloudinary currently integrates the following Microsoft Azure Video Indexing services:

  • Video categorization: Identifies visual objects, brands, and actions displayed in a video, and recognizes over 1 million celebrities. This extends Cloudinary's semantic data extraction and tagging features, so that your videos can be tagged automatically according to the categories and tags detected in each video.
  • Video transcription: Automatically generate speech-to-text transcripts of videos that you or your users upload to your account. The add-on supports transcribing videos in almost any language. You can parse the contents of the returned transcript file to display the transcript of your video on your page, making your content more skimmable, accessible, and SEO-friendly.

Video categorization

Take a look at the following video of a dog called Jack:

Ruby:
cl_video_tag("jack")
PHP:
cl_video_tag("jack")
Python:
CloudinaryVideo("jack").video()
Node.js:
cloudinary.video("jack")
Java:
cloudinary.url().videoTag("jack");
JS:
cloudinary.videoTag('jack').toHtml();
jQuery:
$.cloudinary.video("jack")
React:
<Video publicId="jack" >

</Video>
Vue.js:
<cld-video publicId="jack" >

</cld-video>
Angular:
<cl-video public-id="jack" >

</cl-video>
.Net:
cloudinary.Api.UrlVideoUp.BuildVideoTag("jack")
Android:
MediaManager.get().url().resourceType("video").generate("jack.mp4");
iOS:
cloudinary.createUrl().setResourceType("video").generate("jack.mp4")

When you set the categorization parameter to azure_video_indexer in a call to Cloudinary's upload or update method, Microsoft Azure Video Indexer automatically classifies the scenes of the uploaded or specified existing video. For example:

Ruby:
Cloudinary::Uploader.upload("jack.mp4", 
  :resource_type => :video, 
  :categorization => "azure_video_indexer")
PHP:
\Cloudinary\Uploader::upload("jack.mp4", 
  array(
    "resource_type" => "video", 
    "categorization" => "azure_video_indexer"));
Python:
cloudinary.uploader.upload("jack.mp4",
  resource_type = "video", 
  categorization = "azure_video_indexer")
Node.js:
cloudinary.v2.uploader.upload("jack.mp4", 
  { resource_type: "video", 
    categorization: "azure_video_indexer" },
  function(error, result) {console.log(result, error) });
Java:
cloudinary.uploader().upload("jack.mp4", 
  ObjectUtils.asMap(
    "resource_type", "video", 
    "categorization", "azure_video_indexer"));
.Net:
var uploadParams = new VideoUploadParams() 
{
  File = new FileDescription(@"jack.mp4"),
  Categorization = "azure_video_indexer"
};
var uploadResult = cloudinary.Upload(uploadParams);

Tips

  • You can use upload presets to centrally define a set of upload options including add-on operations to apply, instead of specifying them in each upload call. You can define multiple upload presets, and apply different presets in different upload scenarios. You can create new upload presets in the Upload page of the Management Console settings or using the upload_presets Admin API method. From the Upload page of the console settings, you can also select default upload presets to use for image, video, and raw API uploads (respectively) as well as default presets for image, video, and raw uploads performed via the Media Library UI.
  • You can run multiple categorization add-ons on the resource. The categorization parameter accepts a comma-separated list of all the Cloudinary categorization add-ons to run on the resource.

The video analysis and categorization is performed asynchronously after the method call is completed.

Note
The amount of time it takes for analysis and categorization of the video depends on the size and length of the video file itself. You can include a notification_url parameter in your request to get a notification to the requested URL when the categorization is ready.

The response of the upload method indicates that the process is in pending status.

{ 
  ...
  "info": { 
    "categorization": { 
      "azure_video_indexer": { 
        "status": "pending" 
      }
    }
  },
  ...
}
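
As a minimal sketch, the status can be read from the response once it has been parsed into a dictionary. The helper name categorization_status is hypothetical (it is not part of the Cloudinary SDK), and the code assumes the response follows the nesting shown in the snippet above:

```python
def categorization_status(response, addon="azure_video_indexer"):
    """Return the add-on status from an upload/update response, or None.

    Assumes the nesting "info" -> "categorization" -> add-on name -> "status"
    shown in the documented response snippet.
    """
    info = response.get("info", {})
    addon_info = info.get("categorization", {}).get(addon, {})
    return addon_info.get("status")


# Hand-written fragment that mirrors the documented response shape.
response = {
    "info": {"categorization": {"azure_video_indexer": {"status": "pending"}}}
}
print(categorization_status(response))  # pending
```

A return value of None means the response carried no categorization information for that add-on.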

Once the categorization process completes, the information is returned to Cloudinary and stored with your video. The details of the analysis and categorization are also sent to the notification_url if this option was included with your method call. For example:

{
  "info_kind": "azure_video_indexer",
  "info_status": "complete",
  "public_id": "jack",
  "uploaded_at": "2019-02-25T10:35:25Z",
  "version": 1551090925,
  "url": "http://res.cloudinary.com/demo/video/upload/v1551090925/jack.mp4",
  "secure_url": "https://res.cloudinary.com/demo/video/upload/v1551090925/jack.mp4",
  "etag": "58caa750581129cdca6ac4496653e186",
  "notification_type": "info",
  "info_data": {
    "en-US": {
      "labels": [
        {
          "id": 0,
          "name": "brown",
          "language": "en-US",
          "instances": [
            {
              "confidence": 0.9379,
              "adjustedStart": "0:00:00",
              "adjustedEnd": "0:00:13.413",
              "start": "0:00:00",
              "end": "0:00:13.413",
              "duration": "0:00:13.413"
            }
          ]
        },
        {
          "id": 1,
          "name": "dog",
          "language": "en-US",
          "instances": [
            {
              "confidence": 1,
              "adjustedStart": "0:00:00",
              "adjustedEnd": "0:00:13.413",
              "start": "0:00:00",
              "end": "0:00:13.413",
              "duration": "0:00:13.413"
            }
          ]
        },
        {
    ...
}

The information includes the automatic tagging and categorization information identified by the Microsoft Azure Video Indexer add-on. As can be seen in the example snippet above, various labels (tags) were automatically detected in the uploaded video. Each label is listed together with other information including the start and end times of the relevant video segment. The confidence score is a numerical value that represents the confidence level of the detected label, where 1.0 means 100% confidence.

Note
The entire JSON response from Microsoft is included: for detailed information on the response structure see the Microsoft Video Indexer output.
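
If you handle the notification yourself, a short sketch like the following could pull the label names and confidence scores out of the payload. The function name labels_above is hypothetical, and using the highest instance confidence per label is an assumption made for illustration; it is not documented add-on behavior:

```python
def labels_above(notification, min_confidence, language="en-US"):
    """Map each label name to its highest instance confidence,
    keeping only labels whose best score reaches min_confidence."""
    labels = (
        notification.get("info_data", {})
        .get(language, {})
        .get("labels", [])
    )
    result = {}
    for label in labels:
        scores = [inst["confidence"] for inst in label.get("instances", [])]
        best = max(scores, default=0.0)
        if best >= min_confidence:
            result[label["name"]] = best
    return result


# Abbreviated copy of the example notification shown above.
notification = {
    "info_kind": "azure_video_indexer",
    "info_status": "complete",
    "info_data": {
        "en-US": {
            "labels": [
                {"id": 0, "name": "brown", "instances": [{"confidence": 0.9379}]},
                {"id": 1, "name": "dog", "instances": [{"confidence": 1}]},
            ]
        }
    },
}
print(labels_above(notification, 0.95))  # {'dog': 1}
```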

Automatically adding tags to videos

Automatically categorizing your videos is a useful way to organize your Cloudinary media library. By providing the auto_tagging parameter in an upload or update call for any video where azure_video_indexer was run, the video is automatically assigned tags based on the detected scene labels, brands and celebrity faces. The value of the auto_tagging parameter is the minimum confidence score to be automatically used as an assigned resource tag. Assigning these resource tags allows you to list and search videos in your media library using Cloudinary's API or Web interface.

The following code example automatically tags an uploaded video with all detected scene labels, brands and celebrity faces that have a confidence score higher than 0.6.

Ruby:
Cloudinary::Uploader.upload("jack.mp4", 
  :resource_type => :video, 
  :categorization => "azure_video_indexer", 
  :auto_tagging => 0.6)
PHP:
\Cloudinary\Uploader::upload("jack.mp4", 
  array(
    "resource_type" => "video",
    "categorization" => "azure_video_indexer", 
    "auto_tagging" => 0.6));
Python:
cloudinary.uploader.upload("jack.mp4",
  resource_type = "video",
  categorization = "azure_video_indexer", 
  auto_tagging = 0.6)
Node.js:
cloudinary.v2.uploader.upload("jack.mp4", 
  { resource_type: "video", 
    categorization: "azure_video_indexer", 
    auto_tagging: 0.6 },
  function(error, result) {console.log(result, error) });
Java:
cloudinary.uploader().upload("jack.mp4", 
  ObjectUtils.asMap(
    "resource_type", "video", 
    "categorization", "azure_video_indexer", 
    "auto_tagging", "0.6"));
.Net:
var uploadParams = new VideoUploadParams() 
{
  File = new FileDescription(@"jack.mp4"),
  Categorization = "azure_video_indexer",
  AutoTagging = 0.6
};
var uploadResult = cloudinary.Upload(uploadParams);

The response to the upload call above includes the detected categorization as well as the automatically assigned tags. In this case:

{
  ...
  "tags": [ "brown", "dog", "grass", "looking", "sitting", "animal", "white", "mammal" ],
  ...
}

Tagging uploaded videos

You can also use the update method to run categorization on already uploaded videos, identified by their public IDs, and then automatically tag them according to the detected categories.

For example, the following video was uploaded to Cloudinary with the 'horses' public ID:

Ruby:
cl_video_tag("horses")
PHP:
cl_video_tag("horses")
Python:
CloudinaryVideo("horses").video()
Node.js:
cloudinary.video("horses")
Java:
cloudinary.url().videoTag("horses");
JS:
cloudinary.videoTag('horses').toHtml();
jQuery:
$.cloudinary.video("horses")
React:
<Video publicId="horses" >

</Video>
Vue.js:
<cld-video publicId="horses" >

</cld-video>
Angular:
<cl-video public-id="horses" >

</cl-video>
.Net:
cloudinary.Api.UrlVideoUp.BuildVideoTag("horses")
Android:
MediaManager.get().url().resourceType("video").generate("horses.mp4");
iOS:
cloudinary.createUrl().setResourceType("video").generate("horses.mp4")

The following code sample uses Cloudinary's update method to apply automatic video tagging and categorization to an already uploaded video (here, one with the public ID sample), and then automatically assign resource tags based on the categories detected with a confidence level above 60%.

Ruby:
Cloudinary::Api.update("sample", 
   :resource_type => :video, 
   :categorization => "azure_video_indexer", 
   :auto_tagging => 0.6)
PHP:
\Cloudinary\Api::update("sample", 
  array(
    "resource_type" => "video",
    "categorization" => "azure_video_indexer", 
    "auto_tagging" => 0.6));
Python:
cloudinary.api.update("sample",
  resource_type = "video",
  categorization = "azure_video_indexer", 
  auto_tagging = 0.6)
Node.js:
cloudinary.v2.api.update("sample", 
  { resource_type: "video", 
    categorization: "azure_video_indexer", 
    auto_tagging: 0.6 },
  function(error, result) {console.log(result, error) });
Java:
cloudinary.api().update("sample", ObjectUtils.asMap(
  "resource_type", "video",
  "categorization", "azure_video_indexer", 
  "auto_tagging", 0.6));
.Net:
var updateParams = new UpdateParams("sample") 
{
  Categorization = "azure_video_indexer",
  ResourceType = "video",
  AutoTagging = 0.6
};
var updateResult = cloudinary.UpdateResource(updateParams);

Tip
Whether or not you included a notification_url to get a response on the analysis, you can always use the Admin API's resource method to return the details of a resource, including the categorization that you already extracted using the upload or update methods.

Video transcription

To request a transcript for a video or audio file (in the default US English language), include the raw_convert parameter with the value azure_video_indexer in your upload or update call. (For other languages, see transcription languages below.)

For example, to request transcription on the introduction to a video tutorial on folder permissions (see the full tutorial here):

Ruby:
Cloudinary::Uploader.upload("folder-permissions-tutorial.mp4", 
  :resource_type => :video, 
  :raw_convert => "azure_video_indexer")
PHP:
\Cloudinary\Uploader::upload("folder-permissions-tutorial.mp4", 
  array(
    "resource_type" => "video", 
    "raw_convert" => "azure_video_indexer"));
Python:
cloudinary.uploader.upload("folder-permissions-tutorial.mp4",
  resource_type = "video", 
  raw_convert = "azure_video_indexer")
Node.js:
cloudinary.v2.uploader.upload("folder-permissions-tutorial.mp4", 
  { resource_type: "video", 
    raw_convert: "azure_video_indexer" },
  function(error, result) {console.log(result, error) });
Java:
cloudinary.uploader().upload("folder-permissions-tutorial.mp4", 
  ObjectUtils.asMap(
    "resource_type", "video", 
    "raw_convert", "azure_video_indexer"));
.Net:
var uploadParams = new VideoUploadParams()
{
  File = new FileDescription(@"folder-permissions-tutorial.mp4"),
  RawConvert = "azure_video_indexer"
};
var uploadResult = cloudinary.Upload(uploadParams);

Tip
You can use upload presets to centrally define a set of upload options including add-on operations to apply, instead of specifying them in each upload call. You can define multiple upload presets, and apply different presets in different upload scenarios. You can create new upload presets in the Upload page of the Management Console settings or using the upload_presets Admin API method. From the Upload page of the console settings, you can also select default upload presets to use for image, video, and raw API uploads (respectively) as well as default presets for image, video, and raw uploads performed via the Media Library UI.

The azure_video_indexer parameter value activates a call to the Microsoft Azure Video Indexer API, which is performed asynchronously after your original method call is completed. Thus your original method call response displays a pending status:

...
"info": {   
   "raw_convert": {
      "azure_video_indexer": {
        "status": "pending"
      }
    }
 }
...

When the azure_video_indexer asynchronous request is complete (depending on the length of the video), a new raw file is created in your account with the same public ID as your video or audio file and with the en-us.azure.transcript file extension. You can additionally request a standard subtitle format such as 'vtt' or 'srt'.

If you also provided a notification_url in your method call, the specified URL then receives a notification when the process completes:

{
  "info_kind":"azure_video_indexer",
  "info_status":"complete",
  "public_id":"folder-permissions-tutorial",
  ...
}

Transcription languages

If your video/audio file is in a language other than US English, you can request transcription in the relevant language by adding the language code to the raw_convert value (e.g., azure_video_indexer:fr-FR). The resulting transcript file will also include the language code in the name ({public_id}.{lang-code}.azure.transcript).

For example, to request a video transcript in French when uploading the video Paris.mp4:

Ruby:
Cloudinary::Uploader.upload("Paris.mp4", 
  :resource_type => :video, 
  :raw_convert => "azure_video_indexer:fr-FR")
PHP:
\Cloudinary\Uploader::upload("Paris.mp4", 
  array(
    "resource_type" => "video", 
    "raw_convert" => "azure_video_indexer:fr-FR"));
Python:
cloudinary.uploader.upload("Paris.mp4",
  resource_type = "video", 
  raw_convert = "azure_video_indexer:fr-FR")
Node.js:
cloudinary.v2.uploader.upload("Paris.mp4", 
  { resource_type: "video", 
    raw_convert: "azure_video_indexer:fr-FR" },
  function(error, result) {console.log(result, error) });
Java:
cloudinary.uploader().upload("Paris.mp4", 
  ObjectUtils.asMap(
    "resource_type", "video", 
    "raw_convert", "azure_video_indexer:fr-FR"));
.Net:
var uploadParams = new VideoUploadParams()
{
  File = new FileDescription(@"Paris.mp4"),
  RawConvert = "azure_video_indexer:fr-FR"
};
var uploadResult = cloudinary.Upload(uploadParams);

For a full list of supported language and region codes, see the Azure Video Indexer language options.

Cloudinary transcript files

The created .transcript file includes details of the audio transcription, for example:

[
  {
    "confidence":0.9312,
    "transcript":"Many organizations that use cloudinary have thousands if not millions of digital assets stored in their system.",
    "start_time":7.11,
    "end_time":15.1
  },
  {
    "confidence":0.9312,
    "transcript":"However, certain users on the account may not need to access all the images videos and raw files. Because of this cloudinary offers the ability to add a layer of",
    "start_time":15.1,
    "end_time":26.59
  },
 ...
]

Each excerpt of text has a confidence value, and a breakdown of specific start and end times.
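
To display the transcript on a page, you could parse the file and render one line per excerpt. The following is a minimal sketch rather than Cloudinary functionality: format_transcript is a hypothetical helper, and the sample string is a shortened copy of the excerpts above.

```python
import json


def format_transcript(transcript_json):
    """Return one display line per excerpt, prefixed with its start time (MM:SS)."""
    excerpts = json.loads(transcript_json)
    lines = []
    for excerpt in excerpts:
        # Drop the fractional part of the start time for display purposes.
        minutes, seconds = divmod(int(excerpt["start_time"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {excerpt['transcript']}")
    return lines


sample = """[
  {"confidence": 0.9312,
   "transcript": "Many organizations that use cloudinary have thousands if not millions of digital assets stored in their system.",
   "start_time": 7.11, "end_time": 15.1},
  {"confidence": 0.9312,
   "transcript": "However, certain users on the account may not need to access all the images videos and raw files.",
   "start_time": 15.1, "end_time": 26.59}
]"""

for line in format_transcript(sample):
    print(line)
```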

Subtitle length and confidence levels

Microsoft returns transcript excerpts of varying lengths. When displaying subtitles, long excerpts are automatically divided into 20-word entities and displayed on two lines.

You can also optionally set a minimum confidence level for your subtitles, for example: l_subtitles:my-video-id.en-us.azure.transcript:90. In this case, any excerpt that Microsoft returns with a lower confidence value will be omitted from the subtitles. Keep in mind that in some cases, this may exclude several sentences at once.
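
The effect of these two rules can be approximated locally with a short sketch. This is not Cloudinary's implementation: subtitle_chunks is a hypothetical helper, it ignores timing and line layout, and it assumes that the 90 in the URL corresponds to 0.90 on the 0 to 1 scale used in the transcript file.

```python
def subtitle_chunks(excerpts, min_confidence=0.0, max_words=20):
    """Drop low-confidence excerpts, then split the rest into word-limited chunks."""
    chunks = []
    for excerpt in excerpts:
        if excerpt["confidence"] < min_confidence:
            continue  # the whole excerpt is omitted, however long it is
        words = excerpt["transcript"].split()
        for i in range(0, len(words), max_words):
            chunks.append(" ".join(words[i:i + max_words]))
    return chunks


# Synthetic excerpts: one long and confident, one short and uncertain.
excerpts = [
    {"confidence": 0.95, "transcript": " ".join(f"w{i}" for i in range(45))},
    {"confidence": 0.50, "transcript": "too uncertain to show"},
]
chunks = subtitle_chunks(excerpts, min_confidence=0.9)
print([len(chunk.split()) for chunk in chunks])  # [20, 20, 5]
```

The second excerpt disappears entirely, which illustrates why a high threshold can remove several sentences at once.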

Generating standard subtitle formats

If you want to include the transcript as a separate track for a video player, you can also request that Cloudinary create an SRT and/or WebVTT raw file by including the srt and/or vtt qualifiers (separated by a colon) with the azure_video_indexer value. For example, to upload a video and also request both srt and vtt files with the transcript:

Ruby:
Cloudinary::Uploader.upload("folder-permissions-tutorial.mp4", 
  :resource_type => :video, 
  :raw_convert => "azure_video_indexer:srt:vtt")
PHP:
\Cloudinary\Uploader::upload("folder-permissions-tutorial.mp4", 
  array(
    "resource_type" => "video", 
    "raw_convert" => "azure_video_indexer:srt:vtt"));
Python:
cloudinary.uploader.upload("folder-permissions-tutorial.mp4",
  resource_type = "video", 
  raw_convert = "azure_video_indexer:srt:vtt")
Node.js:
cloudinary.v2.uploader.upload("folder-permissions-tutorial.mp4",
  { resource_type: "video", 
    raw_convert: "azure_video_indexer:srt:vtt" },
  function(error, result) {console.log(result, error) });
Java:
cloudinary.uploader().upload("folder-permissions-tutorial.mp4", 
  ObjectUtils.asMap(
    "resource_type", "video", 
    "raw_convert", "azure_video_indexer:srt:vtt"));
.Net:
var uploadParams = new VideoUploadParams()
{
  File = new FileDescription(@"folder-permissions-tutorial.mp4"),
  RawConvert = "azure_video_indexer:srt:vtt"
};
var uploadResult = cloudinary.Upload(uploadParams);

When the request completes, there will be 4 files associated with the uploaded video in your account:

.../video/upload/folder-permissions-tutorial.mp4    // the source video
.../raw/upload/folder-permissions-tutorial.en-us.azure.transcript
.../raw/upload/folder-permissions-tutorial.en-us.azure.srt
.../raw/upload/folder-permissions-tutorial.en-us.azure.vtt
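
The naming pattern in the list above can be captured in a small sketch that derives the raw file URLs from the video's public ID. The helper name transcript_urls is hypothetical, the code covers only the default US English case shown here, and the cloud name demo is used purely as an example:

```python
def transcript_urls(cloud_name, video_public_id, formats=()):
    """Build the expected raw-file URLs for a default-language (en-us) request."""
    base = (
        f"https://res.cloudinary.com/{cloud_name}/raw/upload/"
        f"{video_public_id}.en-us.azure"
    )
    # The .transcript file is always created; srt/vtt only when requested.
    return [f"{base}.transcript"] + [f"{base}.{fmt}" for fmt in formats]


for url in transcript_urls("demo", "folder-permissions-tutorial", ["srt", "vtt"]):
    print(url)
```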

Notes

  • If you also specify a language in the azure_video_indexer transcript request:
    - the request for format must be given before the language (e.g., azure_video_indexer:srt:vtt:fr-FR)
    - the generated files will include the language and region code in the generated filename (e.g., folder-permissions-tutorial.fr-FR.azure.vtt).
  • No speech recognition tool is 100% accurate. If exact accuracy is important for your video, you can download the generated .transcript, .srt or .vtt file, edit it manually, and re-upload it (overwriting the original file).
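
Because the order of the qualifiers matters, it can help to assemble the raw_convert value in one place. The following is a minimal sketch with a hypothetical helper name (raw_convert_value); it only builds the string and does not call the Cloudinary API:

```python
def raw_convert_value(formats=(), language=None):
    """Build a raw_convert value: add-on name, then formats, then language."""
    allowed_formats = {"srt", "vtt"}
    parts = ["azure_video_indexer"]
    for fmt in formats:
        if fmt not in allowed_formats:
            raise ValueError(f"unsupported subtitle format: {fmt}")
        parts.append(fmt)
    if language:
        # The language code always comes after any format qualifiers.
        parts.append(language)
    return ":".join(parts)


print(raw_convert_value())                         # azure_video_indexer
print(raw_convert_value(["srt", "vtt"]))           # azure_video_indexer:srt:vtt
print(raw_convert_value(["srt", "vtt"], "fr-FR"))  # azure_video_indexer:srt:vtt:fr-FR
```

The returned string can then be passed as the raw_convert parameter of an upload or update call.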

Displaying transcripts as subtitle overlays

Cloudinary can automatically generate subtitles from the returned transcripts. To automatically embed subtitles with your video, add the subtitles property of the overlay parameter (l_subtitles in URLs), followed by the public ID of the raw transcript file (including the extension).

For example, the following code delivers the video with automatically generated subtitles:

Ruby:
cl_video_tag("folder-permissions-tutorial", :overlay=>{:resource_type=>"subtitles", :public_id=>"folder-permissions-tutorial.en-us.azure.transcript"})
PHP:
cl_video_tag("folder-permissions-tutorial", array("overlay"=>array("resource_type"=>"subtitles", "public_id"=>"folder-permissions-tutorial.en-us.azure.transcript")))
Python:
CloudinaryVideo("folder-permissions-tutorial").video(overlay={'resource_type': "subtitles", 'public_id': "folder-permissions-tutorial.en-us.azure.transcript"})
Node.js:
cloudinary.video("folder-permissions-tutorial", {overlay: {resource_type: "subtitles", public_id: "folder-permissions-tutorial.en-us.azure.transcript"}})
Java:
cloudinary.url().transformation(new Transformation().overlay(new SubtitlesLayer().publicId("folder-permissions-tutorial.en-us.azure.transcript"))).videoTag("folder-permissions-tutorial");
JS:
cloudinary.videoTag('folder-permissions-tutorial', {overlay: new cloudinary.SubtitlesLayer().publicId("folder-permissions-tutorial.en-us.azure.transcript")}).toHtml();
jQuery:
$.cloudinary.video("folder-permissions-tutorial", {overlay: new cloudinary.SubtitlesLayer().publicId("folder-permissions-tutorial.en-us.azure.transcript")})
React:
<Video publicId="folder-permissions-tutorial" >
  <Transformation overlay={{publicId: "folder-permissions-tutorial.en-us.azure.transcript"}} />
</Video>
Vue.js:
<cld-video publicId="folder-permissions-tutorial" >
  <cld-transformation overlay={{publicId: "folder-permissions-tutorial.en-us.azure.transcript"}} />
</cld-video>
Angular:
<cl-video public-id="folder-permissions-tutorial" >
  <cl-transformation overlay="subtitles:folder-permissions-tutorial.en-us.azure.transcript">
  </cl-transformation>
</cl-video>
.Net:
cloudinary.Api.UrlVideoUp.Transform(new Transformation().Overlay(new SubtitlesLayer().PublicId("folder-permissions-tutorial.en-us.azure.transcript"))).BuildVideoTag("folder-permissions-tutorial")
Android:
MediaManager.get().url().transformation(new Transformation().overlay(new SubtitlesLayer().publicId("folder-permissions-tutorial.en-us.azure.transcript"))).resourceType("video").generate("folder-permissions-tutorial.mp4");
iOS:
cloudinary.createUrl().setResourceType("video").setTransformation(CLDTransformation().setOverlay("subtitles:folder-permissions-tutorial.en-us.azure.transcript")).generate("folder-permissions-tutorial.mp4")

As with any subtitle overlay, you can use transformation parameters to make a variety of formatting adjustments when you overlay an automatically generated transcript file, including choice of font, font size, fill, outline color, and gravity.

For example, these subtitles are displayed using the Times font, size 20, in a blue color, and located at the top of the screen (north):

Ruby:
cl_video_tag("folder-permissions-tutorial", :overlay=>{:resource_type=>"subtitles", :font_family=>"times", :font_size=>20, :public_id=>"folder-permissions-tutorial.en-us.azure.transcript"}, :color=>"blue", :gravity=>"north")
PHP:
cl_video_tag("folder-permissions-tutorial", array("overlay"=>array("resource_type"=>"subtitles", "font_family"=>"times", "font_size"=>20, "public_id"=>"folder-permissions-tutorial.en-us.azure.transcript"), "color"=>"blue", "gravity"=>"north"))
Python:
CloudinaryVideo("folder-permissions-tutorial").video(overlay={'resource_type': "subtitles", 'font_family': "times", 'font_size': 20, 'public_id': "folder-permissions-tutorial.en-us.azure.transcript"}, color="blue", gravity="north")
Node.js:
cloudinary.video("folder-permissions-tutorial", {overlay: {resource_type: "subtitles", font_family: "times", font_size: 20, public_id: "folder-permissions-tutorial.en-us.azure.transcript"}, color: "blue", gravity: "north"})
Java:
cloudinary.url().transformation(new Transformation().overlay(new SubtitlesLayer().fontFamily("times").fontSize(20).publicId("folder-permissions-tutorial.en-us.azure.transcript")).color("blue").gravity("north")).videoTag("folder-permissions-tutorial");
JS:
cloudinary.videoTag('folder-permissions-tutorial', {overlay: new cloudinary.SubtitlesLayer().fontFamily("times").fontSize(20).publicId("folder-permissions-tutorial.en-us.azure.transcript"), color: "blue", gravity: "north"}).toHtml();
jQuery:
$.cloudinary.video("folder-permissions-tutorial", {overlay: new cloudinary.SubtitlesLayer().fontFamily("times").fontSize(20).publicId("folder-permissions-tutorial.en-us.azure.transcript"), color: "blue", gravity: "north"})
React:
<Video publicId="folder-permissions-tutorial" >
  <Transformation overlay={{fontFamily: "times", fontSize: 20, publicId: "folder-permissions-tutorial.en-us.azure.transcript"}} color="blue" gravity="north" />
</Video>
Vue.js:
<cld-video publicId="folder-permissions-tutorial" >
  <cld-transformation overlay={{fontFamily: "times", fontSize: 20, publicId: "folder-permissions-tutorial.en-us.azure.transcript"}} color="blue" gravity="north" />
</cld-video>
Angular:
<cl-video public-id="folder-permissions-tutorial" >
  <cl-transformation overlay="subtitles:times_20:folder-permissions-tutorial.en-us.azure.transcript" color="blue" gravity="north">
  </cl-transformation>
</cl-video>
.Net:
cloudinary.Api.UrlVideoUp.Transform(new Transformation().Overlay(new SubtitlesLayer().FontFamily("times").FontSize(20).PublicId("folder-permissions-tutorial.en-us.azure.transcript")).Color("blue").Gravity("north")).BuildVideoTag("folder-permissions-tutorial")
Android:
MediaManager.get().url().transformation(new Transformation().overlay(new SubtitlesLayer().fontFamily("times").fontSize(20).publicId("folder-permissions-tutorial.en-us.azure.transcript")).color("blue").gravity("north")).resourceType("video").generate("folder-permissions-tutorial.mp4");
iOS:
cloudinary.createUrl().setResourceType("video").setTransformation(CLDTransformation().setOverlay("subtitles:times_20:folder-permissions-tutorial.en-us.azure.transcript").setColor("blue").setGravity("north")).generate("folder-permissions-tutorial.mp4")
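
For reference, the delivery URL that these calls correspond to can be assembled by hand. The sketch below is illustrative only: subtitles_url is a hypothetical helper (the SDKs normally build the URL for you), it assumes the co_ and g_ URL parameters for color and gravity, and it uses the demo cloud name as an example.

```python
def subtitles_url(cloud_name, video_public_id, transcript_public_id,
                  font_family=None, font_size=None, color=None, gravity=None):
    """Assemble a video URL with a subtitles overlay and optional styling."""
    layer = ["l_subtitles"]
    if font_family and font_size:
        # Font settings go between the layer type and the transcript ID.
        layer.append(f"{font_family}_{font_size}")
    layer.append(transcript_public_id)

    params = [":".join(layer)]
    if color:
        params.append(f"co_{color}")
    if gravity:
        params.append(f"g_{gravity}")

    transformation = ",".join(params)
    return (
        f"https://res.cloudinary.com/{cloud_name}/video/upload/"
        f"{transformation}/{video_public_id}.mp4"
    )


print(subtitles_url(
    "demo",
    "folder-permissions-tutorial",
    "folder-permissions-tutorial.en-us.azure.transcript",
    font_family="times", font_size=20, color="blue", gravity="north",
))
```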

Displaying transcripts as a separate track

Instead of embedding a transcript in your video as an overlay, you can alternatively add returned vtt or srt transcript files as a separate track for a video player. This way, the subtitles can be controlled (toggled on/off) separately from the video itself. For example, to add the video and transcript sources for an HTML5 video player:

<video crossorigin autobuffer controls muted 
  poster="https://res.cloudinary.com/demo/video/upload/w_800/folder-permissions-tutorial.jpg" >
     <source id="mp4" src="https://res.cloudinary.com/demo/video/upload/w_800/folder-permissions-tutorial.mp4" type="video/mp4">
     <track label="English" kind="subtitles" srclang="en" src="https://res.cloudinary.com/demo/raw/upload/folder-permissions-tutorial.en-us.azure.vtt" default>
</video>

Note
If you're using the Cloudinary video player, you can add subtitles and captions as a separate text track by using the textTracks parameter.