Cloudinary Blog

Video at Large Scale - Contributions from the Developer Community

By Nicole Amsler

Video is an increasingly important component for websites - whether it’s to inform visitors, enhance user experience or support sales and marketing efforts. But delivering high-quality video at large scale can be quite a challenge. You need to consider encoding, format, bandwidth usage, delivery and the devices on which visitors may be watching the video, to name just a few concerns.

To help answer questions and alleviate any confusion you may have about video at scale, we’ve polled a dozen experts, who share their best practices, tips and tricks. Keep reading to learn more from these industry thought leaders.


There are many technological features that are key to delivering smooth and successful enterprise video. It's the delivery piece that can be a tricky beast if not done right.

First, there’s the complicated web of general internet connections and content distribution networks that a video stream must travel to reach intended viewers. We've designed something we call our software-defined content delivery network (SD-CDN) that cracks the code on leveraging a multitude of distribution networks to make video delivery something you can count on.

The SD-CDN manages interactions between major commercial CDNs as needed, globally and on-the-fly. This tool also analyzes performance and works to prevent stream failure by automatically switching to standby equipment if potential issues are detected, which maximizes resiliency. This capability is especially critical when performing live streaming. With more and more companies live streaming their product launches and other events, making sure your video will scale to meet thousands of simultaneous viewers is essential.

Another important capability that is often overlooked is internal enterprise delivery of a live video stream. We offer our clients an enterprise content delivery network (ECDN) so that streaming video can be delivered to many employees at once. This capability brings video content inside the firewall so that thousands of employees can concurrently watch a video stream without creating network bottlenecks.

For more insight on these capabilities, you can check out our “Definitive Guide to Enterprise Video”: http://info.ustream.tv/definitive-guide-to-enterprise-video.html

Stacy Nawrocki, Director of Product Marketing, IBM Cloud Video

Here are five rules for using audio/video media on your site:

  1. Avoid Autoplay. From my experience, autoplay isn't the way to go. A user could enter the page and walk away from their computer. The volume on the user's computer could be too high or too low and the autoplay could blast them out before they can adjust their volume. Additionally, autoplay could be a waste of bandwidth. Unless you run a video site, like YouTube or MetaCafe, don't use autoplay.
  2. Always Give Stop, Pause and Volume Controls. I've actually found myself on websites that autoplay media and don't provide stop, pause and volume controls. Needless to say, not providing media player controls is a terrible practice.
  3. Make the Media Player Easy to Find. Place your media players in an easy to find area on your site. Burying a media player on the page will only guarantee that your user sets the record for quickest "back button" click ever.
  4. Quality - Everything or Nothing. I can't stand when a video or audio snippet is placed on a website to support an article, but the media is of such poor quality that it actually damages the article's credibility. Provide quality video, even at the expense of load time - I'll wait for something good.
  5. Warn Users About NSFW Material. Unless your website is clearly NSFW as a whole, clearly label any audio or video "NSFW" (and definitely DO NOT autoplay this material).

Link to full article: https://davidwalsh.name/rules-audio-video-media

David Walsh, Mozilla evangelist and MooTools core developer, David Walsh Blog

When I think about responsive video, and anything responsive these days, I think about three main things:

  • Is the media fluid/responsive?
  • Am I serving the right codec?
  • What is the optimal size/weight of the video file I'm serving?

One option for responsive video is self hosting. Self hosting video is great because it's easy to implement on the front end. Gone are the days when you needed a JavaScript or Flash player like JWPlayer (my old personal favorite). Instead, you can now use the magical <video> tag and make your video work in the browser immediately. Of course there are a few snags, but let’s bask in the <video> tag glory for a while. The <video> tag is like a fancy <div>, so you can style it and make it super flexible with relative simplicity.

The problem with self hosting is that you're now paying for the bandwidth of video, and videos are not the lightest in bytes on the internet. You also need to encode the video yourself in all of the different codecs required for cross-browser, cross-device and cross-operating-system support.

Justin Avery, director, digital consultant and responsive web design evangelist - author of ResponsiveDesign.is and RWD Weekly Newsletter

Delivery is one particular area of concern about videos at scale. Streaming video over the internet is no small feat, and doing it at scale can be very challenging. Amongst other things, it involves taking into consideration the heterogeneous set of viewing devices and supporting various codecs, such as HEVC/H.265, H.264, MPEG-4, Theora, VP9, VP8, VP6 and WMV.

One technique for streaming and delivering video at scale is adaptive bitrate streaming (ABS). ABS adjusts the quality of a video stream in real time according to the user's bandwidth and CPU capacity, so videos start sooner and buffer less often. You generate multiple streams of a video with different resolutions, qualities and bitrates. Each of these streams, also known as variants, needs its own index and streaming files. During playback, the video player selects the optimal variant for the client environment, such as screen resolution or player size, and automatically switches between variants as network conditions change.
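
As a rough illustration of the variant selection described above, here is a minimal Python sketch. The bitrate ladder, the `Variant` type, the `pick_variant` function and the 0.8 safety factor are all assumptions made for this example; real players use more elaborate heuristics that also account for buffer level and CPU load.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    """One rendition of the same video (hypothetical ladder entry)."""
    name: str
    height: int        # vertical resolution in pixels
    bitrate_bps: int   # bitrate the rendition needs, in bits per second


def pick_variant(variants: List[Variant], measured_bps: float,
                 safety: float = 0.8) -> Variant:
    """Return the highest-bitrate variant that fits within a safety
    fraction of the measured throughput; fall back to the lowest one."""
    ordered = sorted(variants, key=lambda v: v.bitrate_bps)
    budget = measured_bps * safety
    chosen = ordered[0]  # lowest rendition is the fallback
    for variant in ordered:
        if variant.bitrate_bps <= budget:
            chosen = variant
    return chosen


# Illustrative ladder; the numbers are made up for the example.
LADDER = [
    Variant("360p", 360, 800_000),
    Variant("720p", 720, 2_500_000),
    Variant("1080p", 1080, 5_000_000),
]
```

A player would call something like `pick_variant(LADDER, measured_bps)` each time it re-measures throughput, and switch renditions at the next segment boundary if the answer changes.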

CDNs are also responsible for serving the video needed to play content during streaming. In order to maximize video delivery efficiency, ensure that content is served via servers close to your users/viewers/customers. For example, a user streaming video from the United States should be served content via a CDN located in/around the North America region, not the EU region. This technique goes a long way in providing a great viewing experience for your users. Cloudinary leverages multiple CDNs, including Akamai, the leading CDN provider, with thousands of worldwide edges.

Encoding compresses and transforms video source files to smaller files that users can view on their devices. Handling the overall video encoding process can be challenging. Companies, like Netflix, have built in-house solutions for encoding videos to various deliverable formats because they have enormous engineering resources at their disposal. As an organization or production company serving video content to users without a lot of resources to build your in-house solution, you can employ cloud-based encoding services that will provide on-the-fly video encoding. Cloud services, like Cloudinary, handle on-the-fly encoding effortlessly.

Prosper Otemuyiwa, Google Developer Expert and prominent OSS front-end developer

For video platform owners, it has always been a headache to decide which standard to use to become multiplatform-compatible and reach the maximum number of different devices, while maintaining the best video quality and user experience. Most early video platform deployments used progressive download techniques for VoD content, or costly RTMP solutions like Flash Streaming for live content.

Adobe HDS, Microsoft MSS and Apple HLS are all proprietary adaptive HTTP streaming protocols that were introduced years ago to enable more efficient adaptive video delivery. They use different methodologies and formats, but none is compatible with the wide range of devices (mobile devices, smart TVs, computers, set-top boxes, gaming devices and others).

MPEG Dynamic Adaptive Streaming over HTTP (DASH) was born to be an international open standard that solves issues with internet video, such as accessibility problems caused by firewalls, missing plugins or limited bandwidth. But, as mentioned in the third Encoding.com 2017 Global Media Format Report, although DASH (21%) has shown strong growth year over year, HLS (72%) remains the most popular adaptive bitrate protocol.

Apple keeps HLS as the only standard compatible with its devices. This is a big obstacle to adopting DASH as the only standard, because it would mean losing every viewer on an Apple mobile device. On the other hand, adopting both HLS and DASH is costly, because video platforms have to encode and store both formats.

But, since June 2016, there has been a new hero in the digital video industry: CMAF (Common Media Application Format), which aims to enable a single media encoding to be used across many applications and devices. CMAF supports the fragmented MP4 (fMP4) container, which is also supported by DASH and by the updated Apple HLS manifests. By adopting CMAF, it’s possible to encode and store a single copy of the video content and deliver it with both HLS and DASH manifests.

First tests with fMP4 encoding delivered using CMAF showed great results. CMAF multi-quality video content with multiple audio tracks and WebVTT subtitles was delivered and played with web players like Bitmovin, JWPlayer, TheoPlayer and DashIF. CMAF content is also compatible with Apple and Android mobile devices, with minor issues.

Therefore, content delivery on OTT platforms is expected to improve notably, with a single format serving almost all platforms. Of course, there will be some old, incompatible devices, and it will be necessary to assess whether there are enough of them to justify preserving older distribution formats, such as Progressive Download or Flash.

Contributed by Francesc Mas, engineer consultant, CCMA


Delivering video at a large scale, especially through live streaming, is a daily challenge for many businesses because viewers want to be able to watch their videos anytime, anywhere and from any device at the best quality possible. Online video is expected to reach 1.9 billion users by 2018, and quality expectations are growing too: only 8.2% of viewers will return to a site during the stream if they experience a video failure. So, you’d better find out the best way to deliver your video content across the globe with minimal buffering and the best visual quality possible.

First, you want to make sure that all of your viewers can access your stream. Using HTTP Live Streaming (HLS) is currently the safest bet to reach the largest number of devices at a reasonable cost and a reasonable level of security. HLS, originally developed by Apple, is a live streaming protocol that cuts your stream into video chunks, or fragments, of about 10 seconds each. An HTTP server creates a manifest file, a playlist, that indexes the chunks so your player knows where they are stored and plays them back in the right order.
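
To make the manifest idea concrete, the following Python sketch builds a simple HLS media playlist from a list of chunks. The function name `build_media_playlist` and the segment file names are hypothetical; the tags shown (`#EXTM3U`, `#EXT-X-TARGETDURATION`, `#EXTINF`, `#EXT-X-ENDLIST`) are standard HLS playlist tags, but a production packager would normally generate this file for you.

```python
import math
from typing import List, Tuple


def build_media_playlist(segments: List[Tuple[str, float]],
                         finished: bool = True) -> str:
    """Build an HLS media playlist from (uri, duration_seconds) pairs."""
    # The target duration must be an integer no smaller than any segment.
    target = math.ceil(max(duration for _, duration in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")  # duration of the next chunk
        lines.append(uri)                          # where the chunk is stored
    if finished:
        # Marks a complete (VOD) playlist; a live playlist omits this tag.
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"
```

For example, `build_media_playlist([("seg0.ts", 10.0), ("seg1.ts", 10.0), ("seg2.ts", 4.5)])` lists three chunks in playback order, which is the index the player reads before requesting any video data.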

HTTP Live Streaming also supports adaptive bitrate streaming (ABR), the most recent evolution in delivery, which greatly reduces buffering issues. This method creates multiple renditions of your stream and plays the highest quality possible by dynamically switching between bitrates according to the speed of the network connection.

As a second step, you want to look at a CDN, which is the safest bet to ensure a high-quality video delivery. Being composed of a network of servers around the world, a CDN can send your output stream to an edge server close to your viewer to minimize transmission problems and quality degradation. Just make sure your CDN has servers in the areas where most of your viewers are located, since not all CDNs are global or serve all locations equally.

Especially for live streaming, you want to use a top-tier CDN with a large-scale network of locations to avoid buffering issues as much as possible. You might have to commit to a considerable amount of bandwidth to partner directly with a large CDN, which might not be the best option in terms of pricing.

Often, the best solution is to use an online video platform (OVP) that has already integrated with a top-tier CDN. This approach lets you benefit from the low pricing negotiated by the OVP, along with other integrated tools that help you optimize your video delivery, such as a video analytics dashboard.

To learn more about the DaCast online video platform, visit: https://www.dacast.com/why-dacast-to-broadcast-live/

Elise Furon, product manager, DaCast

A key aspect of delivering video at scale that’s often overlooked is how to architect your infrastructure in a way that aligns with your business initiatives.

Here are some initial questions to consider when defining your video traffic profile:

  • Are you streaming live or video on-demand (VOD)?
  • How many videos are you creating and how quickly do you need to publish your content?
  • Which portion of your video traffic will drive business initiatives and needs to be dynamic (e.g., video manifests, user comments) rather than just cached and delivered (e.g., video segments)?

Choose a CDN that best serves your business objectives. Many of you may know that caching content at the edge of the network (closer to users) cuts origin infrastructure costs, while improving viewer experiences. But what about processing business logic at the edge? Or allowing request chaining for user authentication, feature flagging or A/B testing? Find out what your CDN can do beyond byte delivery to unlock the full potential of this often overlooked piece of your infrastructure.

When architecting your video service, you also should define key performance indicators so you can measure how well you’re tracking towards your goals. You’ll want to evaluate components that give you the visibility you need to uncover insights and the control to take incremental action.

The final piece is analyzing total cost of ownership of your architecture. Go beyond analyzing the prices of the different vendors, because that doesn’t tell the whole story. Consider all the elements of operating your solution: How much time and effort does it take to manage and maintain? How quickly does the support team resolve your issues? Is it self-serviceable?

Ashok Lalwani, head of Video Product, Fastly

There is nothing worse for a customer than waiting, staring anxiously at a frozen uploading progress bar. Unfortunately, videos, which tend to be large files, can take several minutes to upload, especially when being uploaded from mobile devices with spotty internet connections. As the makers behind Filestack, the No. 1 developer service for uploading and managing files, we’d like to share some tips for uploading large files:

  1. Chunk large files into smaller pieces. The best way to upload a large file quickly and effectively is to break it up into more manageable small files. You can specify the size of the pieces to optimize for network conditions. When you chunk video files, you reduce the risk of timeouts. If a small chunk fails to upload for whatever reason, you can automatically retry only that chunk, instead of having to restart the entire upload.
  2. Upload asynchronously. When you upload asynchronously, the upload happens behind the scenes, so your users can continue to use your app as usual, instead of having to stare longingly at the slowly moving progress bar. Instagram does a great job of this, returning you to your feed after you post, showing you that your video is still uploading.
  3. Use a Content Ingestion Network (CIN). A CIN is essentially a reverse CDN. While a CDN has a globally distributed network of server points of presence (POPs) to deliver content, a CIN has a globally distributed network of ingestion POPs to upload content from users nearby. CINs regularly increase upload speed by 10x, which makes a huge difference if you are cutting user wait time from 60 seconds to only 6 seconds for a video to upload.
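
The first tip above, chunking with per-chunk retry, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Filestack's implementation: `send_chunk` is a hypothetical callback standing in for whatever transport you use, and the 5 MB default chunk size and three attempts are arbitrary example values.

```python
from typing import Callable, Iterator


def split_into_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of at most chunk_size bytes."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def upload_in_chunks(data: bytes,
                     send_chunk: Callable[[int, bytes], None],
                     chunk_size: int = 5 * 1024 * 1024,
                     max_attempts: int = 3) -> int:
    """Upload data piece by piece, retrying only the chunk that failed.

    send_chunk(index, chunk) is a hypothetical transport callback that
    raises OSError on failure. Returns the number of chunks uploaded.
    """
    uploaded = 0
    for index, chunk in enumerate(split_into_chunks(data, chunk_size)):
        for attempt in range(1, max_attempts + 1):
            try:
                send_chunk(index, chunk)
                break  # this chunk succeeded; move on to the next one
            except OSError:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt
        uploaded += 1
    return uploaded
```

Because a failure only repeats one chunk, a dropped connection near the end of a large video costs a few megabytes of re-upload rather than the whole file.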

Bethany Stachenfeld, marketing manager, Filestack

It's no secret that the internet is no longer a network consisting primarily of static and dynamic content. It has become a video delivery platform. For new media publishers, social media platforms and websites that lean heavily on video to convey their message or product information, the need to focus on video quality, while optimizing for smaller screens and congested networks, has never been greater. To facilitate the broadest coverage possible, Beamr recommends video encoding approaches that operate in symbiosis with existing standards: H.264/AVC or H.265/HEVC.

Though the H.264 codec is ubiquitous and supported widely across devices and browsers, the next-generation codec H.265/HEVC has been proven to be 30 percent more efficient at the same quality. This added performance - when combined with the power of frame-level optimization using a perceptual quality measure - is establishing a new quality and UX bar.

Block-based video encoding schemes are inherently lossy methods that achieve compression by removing information from the bitstream, taking into account the effect of such information on the perceived visual quality of the video and the file size. A video encoder rate-control algorithm adjusts the encoding parameters to achieve a targeted bitrate. This algorithm allocates a budget of bits to each group of pictures, individual frames, and in some cases, subframes in a video sequence.

Beamr has developed a process for content-adaptive encoding based on closed-loop re-encoding of input frames at various compression levels, while checking the value of Beamr’s patented perceptual quality measure, which has a high correlation with human subjective results. The input to the Beamr optimization process is a video file (when used in the Beamr 5x encoder, YUV is the input) that has been compressed at the quality level desired.

By evaluating the quality of the video on a frame-by-frame basis, this method of content-adaptive encoding ensures that the results are optimal and the video file is encoded to the lowest possible size, while fully retaining the quality desired. The advantage of integrating a quality-driven rate-control process into the video encoding process is an average 20 percent to 50 percent reduction in file/stream size. This level of saving brings tremendous benefits and ROI to any large website operator. For those engaged in video distribution, the adoption of HEVC, along with frame-level optimization, contributes cost savings and UX benefits that directly translate into improved engagement and content monetization.

Mark Donnigan, Vice President, Marketing, Beamr
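
The general shape of a closed-loop, quality-driven search can be sketched in Python as below. This is only a toy illustration of the idea and not Beamr's process: `encode` and `quality` are hypothetical callables standing in for a real encoder and a real perceptual quality measure, and the sketch assumes that a higher level number means stronger compression.

```python
from typing import Callable, Sequence, Tuple


def most_compressed_acceptable(
    frame: bytes,
    levels: Sequence[int],
    encode: Callable[[bytes, int], bytes],
    quality: Callable[[bytes, bytes], float],
    min_quality: float,
) -> Tuple[int, bytes]:
    """Re-encode a frame at several compression levels and keep the
    strongest one whose quality score still meets the threshold.

    Falls back to the weakest compression level if none qualifies.
    """
    # Assumption: a higher level number means stronger compression.
    ordered = sorted(levels, reverse=True)
    for level in ordered:
        candidate = encode(frame, level)
        if quality(frame, candidate) >= min_quality:
            return level, candidate
    weakest = ordered[-1]
    return weakest, encode(frame, weakest)
```

Running this per frame is what makes the loop "closed": each candidate encode is scored against the source before it is accepted, rather than trusting a fixed bitrate target.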

When delivering videos at scale, there are a few things you should consider to ensure the best playback experience. Videos are big files, but the time to first frame can be actively reduced. Most importantly, you'll want the video files cached as close as possible to your users; your CDN's edge servers will help with that. You'll also want to leverage the browser's ability to preload the video using the ‘preload’ attribute in the video tag, so that playback, once triggered, takes an instant instead of waiting for the buffer to fill.

For the lowest possible time-to-first-frame, make sure you use properly encoded MP4 video files in which the first frame is a keyframe. If the first frame is not a keyframe, the video will start by showing a black image until the first keyframe kicks in. Additionally, the MOOV atom should be placed at the beginning of the file so that it's downloaded first. The MOOV atom holds information about the length of the video and is needed by the browser to enable seeking.

Octavian Naicu, founder and product manager, Pipe
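
One way to check where the MOOV atom sits is to walk the top-level boxes of the MP4 file. The Python sketch below does that for data already read into memory; the function names are made up for this example, and it only inspects top-level boxes, which is enough to compare the positions of `moov` and `mdat` (the media data).

```python
import struct
from typing import List


def top_level_boxes(data: bytes) -> List[str]:
    """List the types of the top-level MP4 boxes (atoms) in file order."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        # Each box starts with a 32-bit big-endian size and a 4-byte type.
        size, box_type = struct.unpack(">I4s", data[offset:offset + 8])
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack(">Q", data[offset + 8:offset + 16])
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - offset
        if size < 8:
            break  # malformed header; stop rather than loop forever
        boxes.append(box_type.decode("ascii", errors="replace"))
        offset += size
    return boxes


def moov_before_mdat(data: bytes) -> bool:
    """True if the file has both boxes and moov comes first."""
    boxes = top_level_boxes(data)
    return ("moov" in boxes and "mdat" in boxes
            and boxes.index("moov") < boxes.index("mdat"))
```

If the check fails, many tools can rewrite the file with the atom moved to the front; with ffmpeg, for instance, the `-movflags +faststart` option does this when remuxing.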

Here’s a quick look at the VOD and live bitrate requirements in Apple’s HLS authoring specification for Apple devices. Following are three basic requirements for VOD data rates:

  • The average segment bit rate must be within 10 percent of the AVERAGE-BANDWIDTH attribute.
  • The measured peak bit rate must be within 10 percent of the BANDWIDTH attribute.
  • The peak bit rate should be no more than 200 percent of the average bit rate.

Working through the first requirement is simple. It means that the average bitrate for each segment must be within 10 percent of the AVERAGE-BANDWIDTH attribute, which, according to 4.3.4.2 of the Pantos spec, measures the “average segment bitrate of the variant stream.” In other words, it’s an accuracy requirement: Your actual average must be within 10 percent of the stated attribute in the M3U8 file.

The second and third requirements are potential causes for confusion, because it looks like the second requirement demands a peak within 10 percent of file bandwidth, while the third requires a peak no more than 200 percent of the average bitrate. Once you check the Pantos spec, however, you see that the BANDWIDTH tag represents the “peak segment bit rate of the Variant stream.” It’s not the average bandwidth, it’s the peak. So again, number two is an accuracy requirement in that the peak bit rate should be within 10 percent of the peak bitrate represented in the BANDWIDTH tag.

In contrast, the third is an encoding requirement and an accuracy requirement. That is, the peak bitrate should be no more than 200 percent of the average bit rate, which means that 200 percent constrained VBR is an acceptable technique. I’m more conservative, and recommend 110 percent constrained VBR because of what I learned in an article entitled “Bitrate Control and QoE-CBR is Better” (http://streaminglearningcenter.com/articles/bitrate-control-and-qoe-cbr-is-better.html). That said, you can encode at up to 200 percent constrained VBR and meet the Apple spec.
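
The three VOD checks can be expressed as a short Python sketch. It assumes you already know each segment's size and duration, reads "within 10 percent" as an absolute difference of at most 10 percent of the declared value, and takes the peak as the highest per-segment bitrate; the function name and result keys are made up for this example.

```python
from typing import Dict, List, Tuple


def check_vod_bitrates(segments: List[Tuple[int, float]],
                       average_bandwidth: int,
                       bandwidth: int) -> Dict[str, bool]:
    """Check measured segment bitrates against declared playlist values.

    segments: (size_in_bytes, duration_in_seconds) for each segment.
    average_bandwidth / bandwidth: declared AVERAGE-BANDWIDTH and
    BANDWIDTH attribute values, in bits per second.
    """
    total_bits = sum(size * 8 for size, _ in segments)
    total_seconds = sum(duration for _, duration in segments)
    average = total_bits / total_seconds
    # Simplification: peak is the highest bitrate of any single segment.
    peak = max(size * 8 / duration for size, duration in segments)
    return {
        "average_within_10_percent":
            abs(average - average_bandwidth) <= 0.10 * average_bandwidth,
        "peak_within_10_percent":
            abs(peak - bandwidth) <= 0.10 * bandwidth,
        "peak_at_most_200_percent_of_average":
            peak <= 2.0 * average,
    }
```

The first two entries are the accuracy checks against the declared attributes, and the third is the encoding constraint that limits how far the peak may rise above the average.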

The requirements for live video are much simpler.

  • The average segment bit rate over a long (~1 hour) period of time must be less than 110 percent of the AVERAGE-BANDWIDTH attribute.
  • The measured peak bit rate must be within 25 percent of the BANDWIDTH attribute.

The first one, again, is an accuracy requirement; if you say the stream averages 4 Mbps, the measured long-term average should stay below 4.4 Mbps. The second is an encoding requirement and an accuracy requirement: the peak bit rate must be within 25 percent of the peak bitrate declared in the BANDWIDTH tag. Since the only encoding technique you can use for live is CBR, this shouldn’t be a problem.

Apple could make all of this easier to remember if they changed the name of the BANDWIDTH tag to PEAK BANDWIDTH, which is really what it means. Since that’s unlikely to happen, if you just remember that BANDWIDTH means peak bandwidth, you should be able to keep it straight.

Jan Ozer, expert on H.264 encoding, contributing editor to Streaming Media Magazine

Video is already a critical medium for business and its importance will keep growing. According to Cisco, video will account for 80 percent of the world's internet traffic by 2019. But every additional second of page load increases the risk of abandonment on your website. Video performance is difficult to maintain because you face two events when delivering videos to customers: initial buffer fill, which is the time it takes for your video to start playing, and lag time, which is the time the video pauses to get the next chunks to play.

To ensure your content is delivered correctly, you should anticipate and run load tests on your infrastructure so that you can fine tune, add software layers, and know your users’ future experience before they let you know how bad it is. To ensure your load test is useful, try the following approaches:

  • Ensure you don’t face the cache effect, which makes you think everything is OK even if it’s not.
    • Use many video streams: Your users will surely watch different movies or ads, so reflect that in your test.
    • Vary bandwidth: Your users don’t have the same network bandwidth, particularly mobile users. Vary the bandwidth available to your virtual users to simulate the different networks and measure the lag at each bandwidth.
    • Vary source locations: Your users will probably come from different locations in your country or worldwide, so use different source locations. The cloud is your friend.
  • Check all the content you deliver:
    • To ensure your users have the best experience, you’ll have to deliver at least two or three of the most popular formats.
    • Ensure you test all these formats under load. This can be tedious, but fortunately there are tools that can help you.
  • Reproduce the players’ behavior: Load testing streaming servers realistically is not easy, as a player does many things at the same time that can impact user experience:
    • Download streams: This will occur on startup and while the player is playing the video, to ensure the video keeps playing smoothly.
    • Contact DRM servers to check rights: These can be third parties for which you would define SLAs, but you can also host them. In the latter case, you’ll need to ensure they correctly handle the load.
    • Play stream that involves decoding: This is more a client (player) issue so it is less critical, but to ensure you provide the best experience, use popular and fast codecs to encode your video.
  • Collect the critical metrics: Finally, load testing metrics are a bit different for video streaming load testing. Besides the usual metrics, like connect and download time, in order to understand user experience, you should track the following:
    • Initial buffer time
    • Lag times
    • Lag ratio
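
As a sketch of how these three metrics could be derived from a simulated player's event log, consider the Python example below. The event names and the definition of lag ratio used here (total lag time divided by the time from first frame to end of session) are assumptions for illustration; your load testing tool may define them differently.

```python
from typing import Dict, List, Tuple


def playback_metrics(events: List[Tuple[float, str]]) -> Dict[str, object]:
    """Derive streaming metrics from (timestamp_seconds, event_name) pairs.

    Hypothetical event names: "request", "first_frame", "stall_start",
    "stall_end" and "end".
    """
    request = first_frame = end = None
    stall_started = None
    lags: List[float] = []
    for timestamp, name in events:
        if name == "request":
            request = timestamp
        elif name == "first_frame":
            first_frame = timestamp
        elif name == "stall_start":
            stall_started = timestamp
        elif name == "stall_end" and stall_started is not None:
            lags.append(timestamp - stall_started)  # one lag (rebuffering) period
            stall_started = None
        elif name == "end":
            end = timestamp
    if request is None or first_frame is None or end is None:
        raise ValueError("events must include request, first_frame and end")
    session = end - first_frame
    return {
        "initial_buffer_time": first_frame - request,
        "lag_times": lags,
        # Assumed definition: share of the session spent stalled.
        "lag_ratio": sum(lags) / session if session > 0 else 0.0,
    }
```

Aggregating these values per bandwidth profile and per source location makes it easier to see which groups of virtual users had the worst experience.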

Philippe Mouawad, leading development of UbikLoadPack. UbikLoadPack is a set of Apache JMeter-based plugins for load testing video streaming protocols (MPEG-DASH, HLS, MS Smooth).
