Video Subtitle Translation in Cloudinary With Azure

Video captions and subtitles have evolved from an accessibility feature into a universal tool for user engagement. According to one study, 80% of viewers are more likely to watch a video from beginning to end if captions are available. The same study found that 69% of viewers keep their sound off in public spaces, while 50% prefer to watch on mute regardless of where they are.

Needless to say, adding captions and subtitles will expand your video’s reach. This blog post shows you how to automatically generate and translate subtitles using Cloudinary and Microsoft Azure Video Indexer in a Next.js app.

Here’s a preview of the demo, where users can view video content with English and Spanish subtitles.

If you don’t have a Next.js project already, create one by running the following command:

npx create-next-app@14 subtitle-translation-demo

cd subtitle-translation-demo

Next, install the Cloudinary Node.js SDK, which allows you to interact with Cloudinary’s API directly from your app:

npm install cloudinary

Once this package is installed, connect your app to Cloudinary using your Cloudinary API credentials. Navigate to your Cloudinary dashboard and copy your cloud name, API key, and API secret.

Cloudinary API keys from its dashboard.

Create a .env.local file in the root of your project to securely add your Cloudinary credentials:

NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME=your_cloud_name

NEXT_PUBLIC_CLOUDINARY_API_KEY=your_api_key

CLOUDINARY_API_SECRET=your_api_secret

The next step is to initialize Cloudinary. Create a lib folder in the root of your project, then add a cloudinary.ts file to it to configure Cloudinary:

// lib/cloudinary.ts

import { v2 as cloudinary } from 'cloudinary';

cloudinary.config({

  cloud_name: process.env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME, // Names must match .env.local

  api_key: process.env.NEXT_PUBLIC_CLOUDINARY_API_KEY,

  api_secret: process.env.CLOUDINARY_API_SECRET,

  secure: true,

});

export default cloudinary;

This setup allows you to easily interact with Cloudinary’s API throughout your app.
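A missing or misnamed variable usually only shows up later as a failed API call, so it can help to check the credentials before configuring the SDK. The following is a minimal sketch rather than part of the original demo: readCloudinaryEnv is a hypothetical helper, and it assumes the variable names from the .env.local file above.

```typescript
// Hypothetical helper (not part of the Cloudinary SDK): reads the three
// credentials defined in .env.local and fails fast if any is missing.
type EnvSource = Record<string, string | undefined>;

interface CloudinaryCredentials {
  cloud_name: string;
  api_key: string;
  api_secret: string;
}

function readCloudinaryEnv(env: EnvSource): CloudinaryCredentials {
  // Maps each config key to the environment variable that should hold it.
  const required = {
    cloud_name: 'NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME',
    api_key: 'NEXT_PUBLIC_CLOUDINARY_API_KEY',
    api_secret: 'CLOUDINARY_API_SECRET',
  } as const;

  // Collect every missing name so the error lists them all at once.
  const missing = Object.values(required).filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  return {
    cloud_name: env[required.cloud_name] as string,
    api_key: env[required.api_key] as string,
    api_secret: env[required.api_secret] as string,
  };
}
```

In lib/cloudinary.ts, the result could be spread into the config call, for example cloudinary.config({ ...readCloudinaryEnv(process.env), secure: true }).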

With Cloudinary set up, you can now upload videos using Next.js server actions, which handle the upload on the server and keep that logic out of your client components.

Start by creating a form that allows users to select and upload a video. Add the following form to your page.tsx file, along with a loading state that tracks the upload process.

// app/page.tsx

'use client';

import { useState } from 'react';

export default function Home() {

  const [loading, setLoading] = useState(false);

  const handleSubmit = async (event: React.FormEvent<HTMLFormElement>) => {

    event.preventDefault();

    setLoading(true);

    const formData = new FormData(event.currentTarget); // Retrieve the video file

    try {

      // Upload video function will go here

    } catch (error) {

      console.error('Upload failed:', error);

    } finally {

      setLoading(false);

    }

  };

  return (

    <div>

      <h1>Upload Your Video</h1>

      <form onSubmit={handleSubmit}>

        <input type="file" name="video" accept="video/*" required />

        <button type="submit" disabled={loading}>

          {loading ? 'Uploading...' : 'Upload Video'}

        </button>

      </form>

    </div>

  );

}

Next, create the server action responsible for handling video uploads. This action processes the video and uploads it directly to Cloudinary.

Create an upload.ts file in the app directory of your project and add the following code:

// app/upload.ts

'use server';

import cloudinary from '../lib/cloudinary';

export async function upload(formData: FormData) {

  const file = formData.get('video') as File;

  const buffer: Buffer = Buffer.from(await file.arrayBuffer()); // Convert the video to a buffer

  // Ensure the public ID is safe for URLs

  const safePublicId = file.name.replace(/[^a-zA-Z0-9-_]/g, '_');

  const uploadResponse = await new Promise<{ secure_url: string; public_id: string }>(

    (resolve, reject) => {

      cloudinary.uploader

        .upload_stream(

          {

            resource_type: 'video',

            public_id: safePublicId,

          },

          (error, result) => {

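            if (error || !result) {

              // Reject with an Error object; result can be undefined when the upload fails

              reject(new Error(`Upload failed: ${error?.message ?? 'no result returned'}`));

            } else {

              resolve(result);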

            }

          }

        )

        .end(buffer); // Stream the video file as a buffer to Cloudinary

    }

  );

  // Return the Cloudinary video URL and public ID for further processing

  return {

    originalUrl: uploadResponse.secure_url,

    videoId: uploadResponse.public_id,

  };

}

In the code above:

  • You retrieve the video file from the form (formData), convert it to a buffer for streaming, and sanitize the file name to create a safe public ID.
  • The video is uploaded to Cloudinary using cloudinary.uploader.upload_stream.
  • Once uploaded, the secure URL (secure_url) and public ID (public_id) of the video are returned, which will be used later to display or process the video.
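To make the sanitizing step concrete, here is a small standalone sketch of the same replacement applied to sample file names. The function name toSafePublicId is an assumption for illustration; the regular expression matches the same characters as the one in upload.ts.

```typescript
// Mirrors the replace() call in upload.ts: every character that is not a
// letter, digit, hyphen, or underscore becomes an underscore.
function toSafePublicId(fileName: string): string {
  return fileName.replace(/[^a-zA-Z0-9_-]/g, '_');
}

console.log(toSafePublicId('my video (1).mp4')); // my_video__1__mp4
console.log(toSafePublicId('clip-01.mov')); // clip-01_mov
```

Note that the dot before the extension is replaced as well, so the extension becomes part of the public ID, and two different file names (such as "a b.mp4" and "a_b.mp4") can map to the same ID.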

Now that the server action is in place, let’s update the form handler to trigger the upload function when the form is submitted.

At the top of page.tsx, import the upload function from the upload.ts file. Then call it inside the try block of the handleSubmit function, like this:

// app/page.tsx

import { upload } from './upload';

try {

  const result = await upload(formData); // Trigger video upload to Cloudinary

  console.log('Video URL:', result.originalUrl); // Log or display the uploaded video URL

} catch (error) {

  console.error('Upload failed', error);

} finally {

  setLoading(false);

}

When users submit the form, their video is uploaded to Cloudinary, where you can view it in your Media Library, and the returned URL is logged to the console.

With your video uploaded to Cloudinary, you can leverage Microsoft Azure Video Indexer to automatically generate subtitles and translate them into different languages. 

Because Azure Video Indexer is available as a Cloudinary add-on, you request transcription and translation through the same upload call. Here is how to use the Microsoft Azure Video Indexer:

To start, you must enable the Azure Video Indexer add-on in your Cloudinary account. Head over to the Add-ons section from the Cloudinary dashboard, search for the Microsoft Azure Video Indexer, and click it.

Cloudinary dashboard to access Microsoft Azure Video Indexer

Once there, you can select a subscription plan. For this demo, the free plan (which gives you 30 units per month) will be enough. If you need more, you can choose a paid plan.

Subscription plan selected

Once the Azure Video Indexer is set up, you can request automatic subtitles for your video. Azure Video Indexer transcribes the audio in your video into text.

To do this, include the raw_convert parameter in your Cloudinary upload call in the upload.ts server action like this:

// app/upload.ts

const uploadResponse = await new Promise<{ secure_url: string; public_id: string }>(

  (resolve, reject) => {

    cloudinary.uploader.upload_stream(

      {

        resource_type: 'video',

        public_id: safePublicId,

        raw_convert: 'azure_video_indexer', // Requests the default US English transcript

      },

      (error, result) => {

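        if (error || !result) {

          // Reject with an Error object; result can be undefined when the upload fails

          reject(new Error(`Upload failed: ${error?.message ?? 'no result returned'}`));

        } else {

          resolve(result);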

        }

      }

    ).end(buffer);

  }

);

The azure_video_indexer value in the code above triggers a call to the Azure Video Indexer API, which transcribes the video asynchronously, so the transcript is not available the moment the upload call returns. When the process is done, a new transcript file (e.g., en-us.azure.transcript) is created in your Cloudinary product environment.
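Because the transcription is asynchronous, code that needs the generated file right after an upload has to wait for it. One possible approach, sketched below, is to poll the file's URL until it responds. This is an illustration, not part of the original demo: waitForFile and headProbe are hypothetical names, the retry count and delay are arbitrary, and the default probe assumes that a file that does not exist yet produces a non-2xx response.

```typescript
// Hypothetical helper: polls until a probe reports that the file exists.
// The probe is injectable so the loop can be exercised without a network.
type Probe = (url: string) => Promise<boolean>;

// Default probe: a HEAD request, assuming a file that does not exist yet
// produces a non-2xx response.
const headProbe: Probe = async (url) => {
  const response = await fetch(url, { method: 'HEAD' });
  return response.ok;
};

async function waitForFile(
  url: string,
  options: { attempts?: number; delayMs?: number; probe?: Probe } = {}
): Promise<boolean> {
  const { attempts = 20, delayMs = 5000, probe = headProbe } = options;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    if (await probe(url)) return true; // file is ready

    // Wait before the next attempt, but not after the last one.
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }

  return false; // gave up; the caller decides how to report this
}
```

The function returns false instead of throwing when the file never appears, so the caller can choose between showing the video without subtitles and reporting an error.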

If you want to use the transcript as subtitles in a video player, you can request specific subtitle formats like WebVTT (.vtt) or SRT (.srt). 

To do this, append the file type to the raw_convert value with a colon, like so:

raw_convert: 'azure_video_indexer:vtt'

This will generate subtitles in .vtt format, which can be added as a separate track in HTML5 video players.

By default, Azure Video Indexer assumes your video is in U.S. English. If your video is in another language, specify the source language using the raw_convert parameter. 

For example, to generate a transcript in French, you would use:

cloudinary.uploader.upload( // cloudinary is the v2 instance exported from lib/cloudinary.ts

  "my-video.mp4", 

  { resource_type: 'video', raw_convert: 'azure_video_indexer:fr-FR' }

).then(result => console.log(result));

You can also translate the subtitles into multiple languages by specifying additional target languages after the source language, separated by colons. For example:

raw_convert: 'azure_video_indexer:fr-FR:pl-PL:he-IL:et-EE'

This will first generate a French transcript (as the source language) and then create Polish, Hebrew, and Estonian translations. Keep in mind that you can request up to five languages (including the source language) in a single request.

If you want to generate both subtitle formats (like .vtt) and translations, specify the subtitle formats before the languages:

raw_convert: 'azure_video_indexer:srt:vtt:en-US:fr-FR'

This will generate .vtt and .srt subtitles in both U.S. English and French. The resulting files will be stored in Cloudinary with filenames like video-id.en-US.azure.vtt, which is the pattern the player code later in this post relies on.
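The ordering rules above (add-on name, then subtitle formats, then source and target languages) can be captured in a small helper so the string is not assembled by hand. This is only a sketch: buildRawConvert is a hypothetical function, not part of the Cloudinary SDK, and the five-language check simply encodes the limit stated above.

```typescript
type SubtitleFormat = 'srt' | 'vtt';

// Hypothetical helper: assembles the raw_convert value in the order the
// article describes (formats first, then source and target languages).
function buildRawConvert(
  languages: string[] = [],
  formats: SubtitleFormat[] = []
): string {
  // The article states a limit of five languages per request, source included.
  if (languages.length > 5) {
    throw new Error('At most five languages (including the source) per request');
  }
  return ['azure_video_indexer', ...formats, ...languages].join(':');
}

console.log(buildRawConvert());
// azure_video_indexer
console.log(buildRawConvert(['fr-FR', 'pl-PL', 'he-IL', 'et-EE']));
// azure_video_indexer:fr-FR:pl-PL:he-IL:et-EE
console.log(buildRawConvert(['en-US', 'fr-FR'], ['srt', 'vtt']));
// azure_video_indexer:srt:vtt:en-US:fr-FR
```

The first entry in languages is treated as the source language, matching the convention described above.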

For this demo, let’s extract the English subtitles from the video and also generate Spanish translations in WebVTT (.vtt) format. This will allow you to display both the original and translated subtitles in your video player in the next section.

To achieve this, update your server action to include both the English (en-US) and Spanish (es-ES) subtitles when uploading the video. 

Here’s the code to update the upload function:

// app/upload.ts

const uploadResponse = await new Promise<{

  secure_url: string;

  public_id: string;

}>((resolve, reject) => {

  cloudinary.uploader.upload_stream(

    {

      resource_type: 'video',

      public_id: safePublicId,

      raw_convert: 'azure_video_indexer:vtt:en-US:es-ES', // Request English and Spanish subtitles in VTT format

    },

    (error, result) => {

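      if (error || !result) {

        // Reject with an Error object; result can be undefined when the upload fails

        reject(new Error(`Upload failed: ${error?.message ?? 'no result returned'}`));

      } else {

        resolve(result);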

      }

    }

  ).end(buffer);

});

With this update, the raw_convert parameter requests Azure Video Indexer to generate subtitles in both English and Spanish, returning them in WebVTT format (.vtt). These subtitle files will be stored in Cloudinary alongside the video and can be accessed using the video’s public ID.

In the next section, your video player will display both the original English subtitles and the translated Spanish subtitles.

For this demo, let’s display two videos side by side. One video will show the original video, while the other will display the video with the generated English and Spanish subtitles.

To achieve this, we retrieve the video URL and video ID upon uploading a video to Cloudinary. These values are stored in individual state variables, along with the Cloudinary cloud name from our .env.local file. The video ID and cloud name allow us to render subtitle tracks for each video.

// app/page.tsx

'use client';

import { useState } from 'react';

export default function Home() {

    const [videoUrl, setVideoUrl] = useState<string>(''); // Store video URL

    const [videoId, setVideoId] = useState<string>(''); // Store the Cloudinary public ID

    const [cloudName] = useState<string>(

        `${process.env.NEXT_PUBLIC_CLOUDINARY_CLOUD_NAME}`

    );

    // Handle video upload to Cloudinary and update state with video URL and ID

    return (

        <div className="min-h-screen flex-col items-center justify-between p-10 mt-14">

            <h1 className="text-3xl text-center pb-5 leading-snug">

                Translate Video Subtitle With <br /> Cloudinary and Azure Video Indexer

            </h1>

            <div className="flex justify-center my-10 items-center ">

                {/* Video upload form */}

            </div>

            {videoUrl && videoId && (

                <div className="flex justify-center space-x-4 mt-10">

                    <div>

                        <h2 className="text-lg font-bold text-center mb-4">

                            Uploaded Video

                        </h2>

                        <video

                            crossOrigin="anonymous"

                            controls

                            className="w-full max-w-md border-4 rounded"

                        >

                            <source id="mp4" src={videoUrl} type="video/mp4" />

                        </video>

                    </div>

                    <div>

                        <h2 className="text-lg font-bold text-center mb-4">

                            Transformed Video

                        </h2>

                        <video

                            crossOrigin="anonymous"

                            controls

                            className="w-full max-w-md border-4 rounded"

                        >

                            <source id="mp4" src={videoUrl} type="video/mp4" />

                            <track

                                label="English"

                                kind="subtitles"

                                srcLang="en"

                                src={`https://res.cloudinary.com/${cloudName}/raw/upload/${videoId}.en-US.azure.vtt`}

                                default

                            />

                            <track

                                label="Spanish"

                                kind="subtitles"

                                srcLang="es"

                                src={`https://res.cloudinary.com/${cloudName}/raw/upload/${videoId}.es-ES.azure.vtt`}

                            />

                        </video>

                    </div>

                </div>

            )}

        </div>

    );

}

In the code above, we place two videos side-by-side in the layout. The first video displays the original upload, while the second includes added subtitle tracks. 

The <track> elements specify English and Spanish subtitle files, which are pulled from Cloudinary using URLs that incorporate the video ID and language codes (en-US and es-ES). This setup allows viewers to switch between subtitle languages directly within the video player.
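If you later add more languages, the track URLs can be derived from a list instead of being written out one by one. The sketch below shows one way to do that. buildSubtitleTracks and its types are hypothetical, the URL pattern is the one used in the page above, and the video ID is assumed to be already URL-safe (as the sanitized public ID is).

```typescript
interface SubtitleLanguage {
  code: string; // language code used in the file name, e.g. 'en-US'
  label: string; // text shown in the player's subtitle menu
}

interface SubtitleTrack {
  src: string;
  srcLang: string;
  label: string;
  isDefault: boolean;
}

// Hypothetical helper: derives one track description per requested language.
function buildSubtitleTracks(
  cloudName: string,
  videoId: string,
  languages: SubtitleLanguage[]
): SubtitleTrack[] {
  return languages.map((language, index) => ({
    src: `https://res.cloudinary.com/${cloudName}/raw/upload/${videoId}.${language.code}.azure.vtt`,
    srcLang: language.code.split('-')[0], // 'en-US' -> 'en'
    label: language.label,
    isDefault: index === 0, // mark only the first track as the default
  }));
}

const tracks = buildSubtitleTracks('demo', 'my_video_mp4', [
  { code: 'en-US', label: 'English' },
  { code: 'es-ES', label: 'Spanish' },
]);
console.log(tracks[1].src);
// https://res.cloudinary.com/demo/raw/upload/my_video_mp4.es-ES.azure.vtt
```

In the component, each entry can be mapped to a track element, passing isDefault as the default attribute so that only one subtitle track is enabled when the video loads.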

The full code for this demo is available in this GitHub repository, and you can also view the demo video here.

In this blog post, you’ve learned how to upload videos to Cloudinary, automatically generate subtitles, and translate them using Microsoft Azure Video Indexer — all within your Next.js app. By following these steps, you can now make your video content accessible to a wider audience.

If you enjoyed this post and want to discuss it more, join the Cloudinary Community forum and its associated Discord. And contact us today to learn more about Cloudinary’s powerful video API features.
