Cloudinary Blog

Audio in Video Is Crucial. Here's How to Produce High-Quality Audio

Why Audio in Video Matters

Many content creators and consumers tend to regard video as visuals, but that’s only part of the experience. Immersive video content includes strong audio. Just like in a movie, the audio for video content comprises many components: the narrator or subjects, the background music that sets the mood and draws viewers in, sound effects, and so forth.

It’s easy to overlook audio in deference to the visuals. However, high-quality audio counts as much in short videos as it does in long productions. Let’s dig into how poor audio impacts otherwise compelling video and explore how Cloudinary helps fix the issues for a more engaging viewing experience.

Understanding the Production Problems

Audio problems are annoying. For example, with multiple clips produced at different times or by different people, the creator might neglect to level the sound, causing sound variation in compound videos.

Imagine this scenario: A viewer is a third of the way through a video, and suddenly the narrator’s voice becomes twice as loud. Or, worse, the next piece of background music jumps up a level and drowns out everything else. Such an abrupt volume change breaks the viewer’s attention, makes the video feel less immersive, and might even cause the viewer to stop watching. On a streaming site like YouTube or Vimeo, you as the producer might lose views or even receive thumbs-down ratings or nasty comments.
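To make the leveling problem concrete, here is a minimal sketch that compares two clips by their root-mean-square (RMS) level and computes the gain needed to match them. The sample arrays and the function names (`rms`, `toDecibels`, `matchingGain`) are made up for illustration. RMS is also only a rough stand-in for perceived loudness, so treat this as a way to reason about the jump rather than as a production leveling routine.

```javascript
// Root-mean-square level of a clip's samples (floats in the range [-1, 1]).
function rms(samples) {
  const sumSquares = samples.reduce((sum, s) => sum + s * s, 0);
  return Math.sqrt(sumSquares / samples.length);
}

// Convert a linear amplitude ratio to decibels.
function toDecibels(ratio) {
  return 20 * Math.log10(ratio);
}

// Gain factor that brings `clip` to the same RMS level as `reference`.
function matchingGain(reference, clip) {
  return rms(reference) / rms(clip);
}

// Hypothetical sample data: the second clip has twice the amplitude of the first.
const narration = [0.25, -0.25, 0.25, -0.25];
const louderClip = [0.5, -0.5, 0.5, -0.5];

console.log(toDecibels(rms(louderClip) / rms(narration)).toFixed(2)); // prints "6.02"
console.log(matchingGain(narration, louderClip)); // prints 0.5
```

In this toy case, doubling the amplitude is a jump of about 6 dB, and multiplying the louder clip by 0.5 brings it back in line with the narration.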

Other problems can result in annoying audio. Parts of the video might go quiet or become almost inaudible. Or the music is scratchy from poor quality or a low recorded bitrate.

However, a video with high audio quality might not be ideal. Why? Because you might’ve leveled all the audio but produced it at too high a quality level, which might crash machines that don’t support the audio codec. Additionally, the video quality might decline if a device’s connection or processor cannot handle the file.

Accessibility of audio matters just as much as quality and leveling since some of your audience might be deaf or hard of hearing, or they might speak a different language. Subtitles or other visual cues would be of tremendous help for them.

Producing optimal audio is challenging. Even experienced creators occasionally overlook certain details or run into obstacles.

Working With Tools

Tools can help solve audio issues. A studio-quality microphone, such as a Blue Yeti model, which is relatively inexpensive and which offers moderate-level recording control, would make a huge difference.

In addition, with premium-quality studio headphones, you can listen to the video during the production process and identify problems. Budget allowing, whole devices dedicated to audio processing and sound control are available, not to mention first-rate computers, equipment, or devices for audio production.

On the other hand, budget constraints might preclude those hardware purchases, especially at the outset. Software is a far more economical alternative, and cloud production takes the load off your machines—especially if you work with only one computer.

As one of the longest-running streaming sites, YouTube offers rudimentary video-editing tools but not a full suite. The tools in production software also vary: some focus on video and others on audio, but many control only the basic audio functions in postproduction. Dedicated postproduction tools come in handy for beating your audio into shape.

Leveraging Cloudinary’s Postproduction Assistance

Cloudinary offers controls for both video and audio. While editing a video with Cloudinary, you can upload the audio files separately and work with several other tools whose transformation capabilities resemble those in photo-editing software: clip, stretch, and so on. Those tools work directly on audio and video, even if the files are already encoded.

Plus, by uploading and hosting videos with Cloudinary, you can apply transformations through APIs, which support services of all kinds. Cloudinary even comes with a video player.

The next section describes a few transformations as examples. Feel free to use some of Cloudinary’s example videos or upload your own audio and video. Before you start, sign up for a free Cloudinary account.

Transforming Audio


Here’s a demo of a simple transformation of a video from the Cloudinary Media Library. Follow these steps:

  1. Double-click recipes and choose one of the four video options. After loading the video, click Transform to go to the video’s Transform page, where you can resize, crop, format, and edit videos on the fly. You can also add special effects.

  2. Scroll down to Audio Codec and click No Audio to remove the audio from the video, for example, so that you can overlay another version. A Refresh button then appears on the demo player.


  3. Click Refresh to preview the change.

    The code line below the player will have changed, and you can now download the edited video or post it as is on a website. If you’re using JavaScript or another framework or language, you can derive code to generate a player for it. See this example with React:

    <Video publicId="recipes/asltranslation">
      {/* audioCodec="none" strips the audio track from the delivered video */}
      <Transformation audioCodec="none" />
    </Video>

Other controls are also available. For example, you can shift the audio frequency or change the codec to a format that performs better on other systems. (As mentioned earlier, too high a sample quality or an unsupported codec might cause crashes.) You can also chain transformations for multiple edits.
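As a rough illustration of how such edits show up in a delivery URL, the sketch below assembles one by hand. The helper name `buildVideoUrl` and the cloud name `my_cloud` are placeholders, not part of any SDK. The sketch assumes the usual Cloudinary URL layout, in which parameters inside one transformation are separated by commas and chained transformations by slashes. `ac_`, `af_`, and `e_volume` are the URL forms of the audio codec, audio frequency, and volume settings; check the transformation reference for the values your account supports.

```javascript
// Build a Cloudinary-style video delivery URL from chained transformations.
// `components` is an array of arrays: each inner array holds the parameters
// of one transformation, and the outer array is the chain, applied in order.
function buildVideoUrl(cloudName, publicId, components, format = "mp4") {
  const chain = components.map((params) => params.join(",")).join("/");
  const prefix = chain ? `${chain}/` : "";
  return `https://res.cloudinary.com/${cloudName}/video/upload/${prefix}${publicId}.${format}`;
}

// Remove the audio track, as in the Media Library demo above.
console.log(buildVideoUrl("my_cloud", "recipes/asltranslation", [["ac_none"]]));
// prints "https://res.cloudinary.com/my_cloud/video/upload/ac_none/recipes/asltranslation.mp4"

// Chain two edits: re-encode the audio as AAC at 44.1 kHz, then raise the volume.
console.log(
  buildVideoUrl("my_cloud", "recipes/asltranslation", [
    ["ac_aac", "af_44100"],
    ["e_volume:50"],
  ])
);
// prints "https://res.cloudinary.com/my_cloud/video/upload/ac_aac,af_44100/e_volume:50/recipes/asltranslation.mp4"
```

In a real project, the Cloudinary SDKs generate these URLs for you. The hand-built version is only meant to show what a chain of transformations looks like.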

Diving Deeper Into the Flow

To correct or edit audio directly, use Cloudinary’s MediaFlows system, with which you can custom-build a video editor through a block-type programming interface that offers different features per block.


MediaFlows is in Beta, requiring a separate login after registration on Google or GitHub.

For sound enhancements, Cloudinary has worked with a partner to build the Media Enhancement block. To enable that block, contact Cloudinary Support. Also, because the block’s features are advanced, they require an additional API key.

Afterwards, you can use the Media Enhancement block to transform the videos within your MediaFlows app. The block appears in the editor, where you can edit the volume, reduce the noise level, isolate speech, or apply speech-leveling effects to fine-tune the video’s audio quality.

Try This MediaFlow Today!
Ready to try MediaFlows for yourself? Check out “Enhancing Audio for Video using Media Enhance API”.

Capitalizing on a Cloudinary Add-On

When you’re building an app, Cloudinary’s add-ons help make your videos accessible. For instance, if you’ve built a custom uploader with Cloudinary, you can run a transcription tool through the Google AI Video Transcription Add-On by calling it in code with the Cloudinary API, just as you do with video transformations.

A case in point: When uploading a video through your app, you can request a transcript from the Cloudinary API with the call below, which is written for the Node.js SDK, or with the equivalent calls for other languages or frameworks.

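  const cloudinary = require("cloudinary");
  // "my_video.mp4" is a placeholder path; substitute the file your app uploads.
  cloudinary.v2.uploader.upload("my_video.mp4",
    { resource_type: "video",
      raw_convert: "google_speech" },
    function(error, result) {console.log(result, error) });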

Cloudinary and the Video Transcription tool transcribe the video in the language you specify. You can then turn the transcription into captions and configure Cloudinary to link to other add-ons for a more accessible video for wider audiences.
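If you want to turn a transcript into captions yourself, a conversion along the following lines is one option. This is a sketch under assumptions: it presumes the transcript is an array of segments, each with a `transcript` string and a `words` list carrying `start_time` and `end_time` in seconds. Verify that shape against the file the add-on actually produces. The function names `formatTimestamp` and `transcriptToVtt` are hypothetical.

```javascript
// Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm).
function formatTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms, 3)}`;
}

// Convert transcript segments (assumed shape, see above) into a WebVTT document.
// Each segment becomes one cue that spans its first word to its last word.
function transcriptToVtt(segments) {
  const cues = segments.map((segment) => {
    const start = segment.words[0].start_time;
    const end = segment.words[segment.words.length - 1].end_time;
    return `${formatTimestamp(start)} --> ${formatTimestamp(end)}\n${segment.transcript}`;
  });
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}

// Hypothetical transcript with a single segment.
const sampleTranscript = [
  {
    transcript: "Audio matters in video.",
    words: [
      { word: "Audio", start_time: 0.5, end_time: 0.75 },
      { word: "video.", start_time: 2.0, end_time: 2.25 },
    ],
  },
];

console.log(transcriptToVtt(sampleTranscript));
```

You can then serve the resulting text as a `.vtt` file and attach it to an HTML5 video player as a captions track.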

Wrapping Up the Track

Because the quality and content of audio can enhance or destroy video, it’s just as crucial as video. A critical task is to ensure that your audio timing is on track.

Even though you can fix most audio problems with the correct tools, you need more help at times, especially if you’re working as a single developer. Give Cloudinary a try to see (and hear) how it can help you attain the video feel you aim for and reach wider audiences. Cloudinary also works with other services, boosting the range of features for managing audio.
