Automatically Generate Subtitles with Cloudinary

As a designer or developer, you quickly learn the value of producing media that everyone can use. Subtitles are a crucial component of video accessibility, and they are increasingly popular with general audiences, too. Yet producing subtitles manually can be laborious and time-consuming.

But it doesn’t have to be. With a tool like Cloudinary at your disposal, you can automatically generate subtitles on the fly for all of your media. In this article, we’ll show you how you can harness the power of Cloudinary to create subtitles and why they’re important in the first place.

When and Why Should You Generate Subtitles?

Subtitles are essential for making your videos accessible. They are also rapidly becoming a preferred option for many viewers, which helps your videos reach a broader audience. Let’s take a look at some of the most common reasons to add subtitles:

  • Improved accessibility. Subtitles make your video content accessible to viewers who are deaf or hard of hearing.
  • Improved user experience. By adding clarity and context, subtitles enhance the overall user experience.
  • Improved engagement. Subtitles make it simpler for viewers to understand and follow along with the content.
  • Viewing in noisy or quiet environments. Subtitles let viewers watch videos where audio is not practical, such as noisy public spaces or settings where silence is required.

Ways to Generate Subtitles Automatically

Generating subtitles manually can be a time-consuming and labor-intensive process. Thankfully, there are several ways to automate it to make it more efficient and cost-effective. Here are some popular methods for generating subtitles automatically:

  • AI-powered transcription services. Some transcription services use AI to generate subtitles automatically with a high degree of accuracy. Companies like Otter.ai, Trint, and Sonix offer AI-powered transcription tools that convert spoken words into text, which can then be used as subtitles.
  • Video editing software with built-in subtitle generators. Some video editing software, like Adobe Premiere Pro and Final Cut Pro, offer built-in subtitle generation features. These tools can automatically transcribe the audio from your video and generate subtitles, which can then be edited and fine-tuned within the software; however, it’s a very manual process.
  • Speech-to-text software. Many speech-to-text tools, such as Google’s Speech-to-Text API or IBM Watson’s Speech to Text, can transcribe spoken words in a video into text. These tools use advanced machine learning algorithms to recognize speech patterns and convert them into written text, which can then be formatted into subtitles. Unfortunately, many of these aren’t fully accurate and can be difficult to incorporate into your workflow.
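
To make the "formatted into subtitles" step concrete, here is a minimal sketch that turns timed text into the SubRip (SRT) subtitle format. The cue shape (start and end in seconds, plus text) is an assumption made for illustration; real speech-to-text services each return their own structure, which you would need to map to it first.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function srtTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Convert an array of cues ({ start, end, text }, a hypothetical shape)
// into SRT: a 1-based index, a time range, the text, then a blank line.
function toSrt(cues) {
  return cues
    .map((cue, i) =>
      `${i + 1}\n${srtTimestamp(cue.start)} --> ${srtTimestamp(cue.end)}\n${cue.text}\n`)
    .join('\n');
}

console.log(toSrt([
  { start: 0, end: 2.5, text: 'Four score' },
  { start: 2.5, end: 5, text: 'and seven years ago' }
]));
```

Doing this by hand for every video is exactly the kind of glue work that an integrated service removes.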

Although these programs can help create subtitles, their output might not always be the most precise or adaptable. Cloudinary offers a more dependable and adaptable way to create subtitles for your media with our automatic subtitle generation tool. So let’s show you exactly how it’s done!

How To Generate Subtitles Automatically with Cloudinary

With Cloudinary, you can generate subtitles automatically whenever you need them, entirely in the cloud. Our Google AI Video Transcription add-on passes your videos through Google’s speech recognition neural networks to produce a transcript. Plus, you still get the rest of the Cloudinary platform.

Prerequisites

Before getting started with this tutorial, we’ll need a few things. First, we need a Cloudinary account (which you can start with for free). We will also write a Node.js script to generate subtitles from our videos automatically, so install Node.js from the official Node page if you don’t already have it.

Next, you’ll need to install Cloudinary’s Node.js SDK to make and authenticate API calls to Cloudinary. To install this library, create a project folder in a directory of your choice, open up your terminal, and type the following command:

npm install cloudinary

The last prerequisite is to activate the Google AI Video Transcription add-on. To do this, log in to your Cloudinary account and head to the Add-on tab.

Next, search for the Google AI Video Transcription add-on:

Finally, click on the add-on and subscribe to the free plan:

With this, we are ready to make API calls to Cloudinary.

Generating Subtitles Using Google AI Video Transcription

We’ll use the Cloudinary Node.js SDK installed in the Prerequisites section to call the Cloudinary API and authenticate our requests.

Now let’s choose the video that we want to generate subtitles for. Open your project folder, create an assets folder, and add your video to it. We will use lincoln.mp4:

Next, we need to configure Cloudinary with our account details. In your project folder, create a new file called Subtitiles.js and start by importing the Cloudinary SDK and configuring it with your account details:

// Import the Cloudinary SDK
const cloudinary = require('cloudinary').v2;
// Configure Cloudinary with your account details
cloudinary.config({
  cloud_name: 'CLOUD_NAME',
  api_key: 'API_KEY',
  api_secret: 'API_SECRET'
});

Replace CLOUD_NAME, API_KEY, and API_SECRET with your Cloudinary credentials, which you can find in your account dashboard.
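
If you would rather not hardcode credentials in the script, one option is to read them from environment variables. The following is a minimal sketch and not part of the tutorial script: the function name loadCloudinaryConfig and the three variable names are hypothetical choices for this example, not names the SDK requires.

```javascript
// Hypothetical environment variable names; pick whatever fits your setup.
const REQUIRED_VARS = {
  cloud_name: 'CLOUDINARY_CLOUD_NAME',
  api_key: 'CLOUDINARY_API_KEY',
  api_secret: 'CLOUDINARY_API_SECRET'
};

// Build the options object for cloudinary.config() from an environment
// object, failing early with a clear message if a value is missing.
function loadCloudinaryConfig(env = process.env) {
  const config = {};
  const missing = [];
  for (const [option, variable] of Object.entries(REQUIRED_VARS)) {
    if (env[variable]) {
      config[option] = env[variable];
    } else {
      missing.push(variable);
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return config;
}

// Usage with the SDK imported earlier in this tutorial:
// cloudinary.config(loadCloudinaryConfig());
```

Failing early keeps a missing secret from surfacing later as a harder-to-read authentication error.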

Next, we’ll upload the video and generate the transcript. Add the following code to Subtitiles.js:

// Define a public ID for the video and its transcript
const videoPublicId = 'lincoln';
const transcriptPublicId = `${videoPublicId}.transcript`;

// Upload the video to Cloudinary and convert it to text using Google Speech Recognition
cloudinary.uploader.upload('assets/lincoln.mp4', {
  public_id: videoPublicId,
  resource_type: 'video',
  raw_convert: 'google_speech'
})

Here, we first define public IDs for the video and its transcript. The transcript’s public ID is the video’s public ID with the suffix .transcript. The cloudinary.uploader.upload() method then uploads the video to Cloudinary, and setting the raw_convert option to google_speech requests a transcript from Google’s speech recognition API. The generated transcript will be named lincoln.transcript. Note that transcription runs asynchronously after the upload call returns, so the transcript may take a little while to appear in your account.
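
The naming rule and the add-on status can be sketched as two small helpers. These are illustrative and separate from Subtitiles.js. The status path used below (info.raw_convert.google_speech.status) is an assumption about the shape of the upload response; check it against the response you actually receive.

```javascript
// Derive the transcript's public ID from the video's public ID,
// following the ".transcript" suffix rule described above.
function transcriptIdFor(videoPublicId) {
  return `${videoPublicId}.transcript`;
}

// Read the transcription status from an upload response.
// Assumption: the add-on reports its status under
// info.raw_convert.google_speech.status (for example "pending").
function transcriptionStatus(uploadResult) {
  return uploadResult?.info?.raw_convert?.google_speech?.status ?? 'unknown';
}

// Example with a hand-written response object (not real API output):
const exampleResult = {
  public_id: 'lincoln',
  info: { raw_convert: { google_speech: { status: 'pending' } } }
};
console.log(transcriptIdFor(exampleResult.public_id)); // lincoln.transcript
console.log(transcriptionStatus(exampleResult));       // pending
```

Logging the status right after the upload is a quick way to confirm that the add-on accepted the request.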

Now that we have the transcript, we can add subtitles to the video. Add the following code to Subtitiles.js:

.then(result => {
  // Create a promise that adds the transcription as a subtitle overlay to the video
  return new Promise((resolve, reject) => {
    // Specify the subtitle overlay as "subtitles:public_id.transcript"
    const subtitlesOverlay = `subtitles:${transcriptPublicId}`;
    // Set the transformation options for the video, including the subtitle overlay
    const transformationOptions = [
      {overlay: subtitlesOverlay},
      {flags: "layer_apply"}
    ];
    // Generate a URL for the video with the subtitle overlay
    const videoUrl = cloudinary.url(videoPublicId, {
      resource_type: 'video',
      transformation: transformationOptions
    });
    // Resolve the promise with the video URL
    resolve(videoUrl);
  });
})
.then(result => console.log(result)) // Print the video URL to the console
.catch(error => console.error(error)); // Handle any errors that occur

Once the upload call resolves, the code returns a new promise that builds the subtitled video URL. The subtitle overlay is specified as subtitles:public_id.transcript, where public_id.transcript is the public ID of the transcript file. The transformation options add that overlay with the overlay option and then apply it with the layer_apply flag.
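
To see what these transformation options mean in URL terms, here is a hand-built sketch of the delivery URL. It is an illustration rather than a replacement for cloudinary.url(): the function name subtitledVideoUrl, its parameters, and the demo cloud name are hypothetical, and the URL the SDK generates may differ in details such as protocol or file extension.

```javascript
// Build a delivery URL with the transcript applied as a subtitle overlay.
// The overlay becomes "l_subtitles:<transcript id>" and the layer_apply
// flag becomes "fl_layer_apply" in the URL.
function subtitledVideoUrl({ cloudName, videoPublicId, format = 'mp4', font }) {
  const transcriptPublicId = `${videoPublicId}.transcript`;
  // Optional font styling, written as "<family>_<size>:" before the ID.
  const style = font ? `${font.family}_${font.size}:` : '';
  const transformation = [
    `l_subtitles:${style}${transcriptPublicId}`,
    'fl_layer_apply'
  ].join('/');
  return `https://res.cloudinary.com/${cloudName}/video/upload/${transformation}/${videoPublicId}.${format}`;
}

console.log(subtitledVideoUrl({ cloudName: 'demo', videoPublicId: 'lincoln' }));
console.log(subtitledVideoUrl({
  cloudName: 'demo',
  videoPublicId: 'lincoln',
  font: { family: 'arial', size: 20 }
}));
```

Seeing the URL written out can help when debugging: if the subtitles do not appear, check that the transcript ID in the l_subtitles component matches a file in your account.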

Finally, a URL for the video with the subtitle overlay is generated using the cloudinary.url() method, and the URL is printed to the console using console.log(). Any errors that occur are handled by the .catch() handler. Our final code should look like this:

// Import the Cloudinary SDK
const cloudinary = require('cloudinary').v2;

// Configure Cloudinary with your account details
cloudinary.config({
  cloud_name: 'CLOUD_NAME',
  api_key: 'API_KEY',
  api_secret: 'API_SECRET'
});

// Define a public ID for the video and its transcript
const videoPublicId = 'lincoln';
const transcriptPublicId = `${videoPublicId}.transcript`;

// Upload the video to Cloudinary and convert it to text using Google Speech Recognition
cloudinary.uploader.upload('assets/lincoln.mp4', {
  public_id: videoPublicId,
  resource_type: 'video',
  raw_convert: 'google_speech'
})
.then(result => {
  // Create a promise that adds the transcription as a subtitle overlay to the video
  return new Promise((resolve, reject) => {

    // Specify the subtitle overlay as "subtitles:public_id.transcript"
    const subtitlesOverlay = `subtitles:${transcriptPublicId}`;

    // Set the transformation options for the video, including the subtitle overlay
    const transformationOptions = [
      {overlay: subtitlesOverlay},
      {flags: "layer_apply"}
    ];

    // Generate a URL for the video with the subtitle overlay
    const videoUrl = cloudinary.url(videoPublicId, {
      resource_type: 'video',
      transformation: transformationOptions
    });

    // Resolve the promise with the video URL
    resolve(videoUrl);
  });
})
.then(result => console.log(result)) // Print the video URL to the console
.catch(error => console.error(error)); // Handle any errors that occur
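
Because the transcript is produced in the background, the subtitled URL may not work until the transcript file exists. One way to handle this is a small retry helper, sketched below. The helper waitUntilReady and its option names are hypothetical. The usage comment assumes that cloudinary.api.resource() rejects while the transcript is still missing.

```javascript
// Pause for the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call check() until it resolves, waiting delayMs between failed attempts.
// Rethrows the last error if every attempt fails.
async function waitUntilReady(check, { attempts = 10, delayMs = 3000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await check();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        await sleep(delayMs);
      }
    }
  }
  throw lastError;
}

// Usage sketch with the SDK and IDs from the script above:
// waitUntilReady(() =>
//   cloudinary.api.resource(transcriptPublicId, { resource_type: 'raw' })
// ).then(() => console.log('Transcript is ready'));
```

Polling keeps the script simple. For production workloads, a notification-based approach may be a better fit than repeated API calls.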

Running the code above yields the following result:

To verify your upload, follow the URL in the terminal output or head to the Media Library tab in your Cloudinary account. If the process was successful, you’ll be able to see your video and its transcription:

Here is what our video looks like:

Make Subtitles A Breeze with Cloudinary

Designers and developers aiming to increase the accessibility of their video content will find automatic subtitle generation a game-changer. Not only does adding subtitles make your content more accessible to users, but it’s also a massive improvement for your overall user experience. Plus, with modern AI-powered tools like Cloudinary, it’s never been easier.

Cloudinary’s sophisticated algorithms and adaptable styling options make it simple to create precise and aesthetically pleasing subtitles that improve the viewing experience for all viewers. So why not try Cloudinary?

Get a free account right now to discover how automating the production of subtitles might enhance your content creation workflow.

Last updated: Feb 4, 2024