Cloudinary Blog

Improve the Web Experience With Progressive Image Decoding

Progressive Image Decoding Delivers an Enhanced Web Experience

Progressive image decoding is an excellent way to accelerate page loads and thereby improve the web-browsing experience. This post explains why and describes recent developments in that approach.

The Importance of Image Compression

Some people say that because internet connections keep getting faster, we don't really need better image compression. They believe that JPEG is good enough and that progressive decoding in particular belongs to the past: important for web surfing in the early 1990s over slow dial-up modems, which are no longer in use in the modern world.

I think those people are wrong. Yes, the internet is faster. However, not everyone has high-speed internet, and even those who do, at home or at work, can't count on it at all times, for example while traveling. Separately, faster connections have led to heavier websites: the web has become far more visual, with ever more and larger images and videos. Images represent a large amount of data: every pixel consists of at least three numbers (R, G, and B), each requiring at least 8 bits. So, without compression, 1 megapixel takes up 3 megabytes. Given that the median webpage contains 2.1 megapixels' worth of images (6.3 MB uncompressed), sending them uncompressed over a 3-Mbps 3G connection would take at least 17 seconds, a long wait!

Plus, we want high-resolution images, as well as images with a wide color gamut and a high dynamic range, which 8 bits per channel cannot deliver. Bottom line: image compression remains a must-do.
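
The back-of-the-envelope figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration, and the 10-bit depth in the last line is an assumed example of a beyond-8-bit encoding, not a figure from this post.

```python
def uncompressed_bytes(megapixels: float, channels: int = 3,
                       bits_per_channel: int = 8) -> float:
    """Raw size of an image with no compression at all."""
    return megapixels * 1_000_000 * channels * bits_per_channel / 8


def transfer_seconds(size_bytes: float, link_mbps: float) -> float:
    """Best-case transfer time, ignoring latency and protocol overhead."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


# 1 megapixel of 8-bit RGB.
print(f"{uncompressed_bytes(1.0) / 1e6:.2f} MB per megapixel")

# The median webpage: 2.1 megapixels over a 3-Mbps connection.
page = uncompressed_bytes(2.1)
print(f"{page / 1e6:.1f} MB, {transfer_seconds(page, 3.0):.1f} s at 3 Mbps")

# Assumed example: 10 bits per channel, as a higher dynamic range might need.
print(f"{uncompressed_bytes(1.0, bits_per_channel=10) / 1e6:.2f} MB per megapixel")
```

The estimate is a lower bound: real connections add latency and overhead on top of the raw transfer time.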

In essence, lossless image compression is conceptually simple: the hard part is finding a more concise representation, but in the end, that representation decodes to exactly the original pixel values that went in. For typical photographs, lossless compression yields a compression ratio of only 2:1, or maybe 3:1, which at best translates to 1 megapixel in 1 megabyte instead of 3. Not bad, but not good enough.

Remarkably, lossy compression can easily deliver ratios of 20:1 with no visible artifacts. In the ideal scenario, the loss shows up only as numerical differences between the original and the decoded pixel values. Visually, unless you zoom in a lot, the images look the same, yet lossy compression brings a 1-megapixel image down from 3 MB to a much more manageable 150 KB.

Remember, you compress online images to improve the browsing experience. Data caps aside, file sizes matter because they determine how long users must wait to see your images. The smaller the files, the faster the images appear and the more pleasing the user experience.

Hence the promise of progressive decoding, which enables browsers to display image content before the files have finished loading.

Progressive Decoding

What's progressive decoding? Clever image codecs organize the compressed bits in such a way that even a partially loaded image (say, the first 10 percent of the file) can be decoded, resulting in a lower-quality or lower-resolution preview. The 30-year-old JPEG codec can do that, but the feature is optional and underused: only a few JPEG encoders, such as mozjpeg, enable it by default.

Progressive decoding can improve the browsing experience by another order of magnitude: on top of the reduction of a 3-MB uncompressed image to 150 KB, it lets the browser display the image after downloading a mere 15 KB. To see the fine details, you must wait until the transfer is complete. However, if you're just scrolling through the webpage, chances are that the preview gives you a good enough idea of the image. For the median webpage, lossy compression shortens the 17-second image-loading time to only about one second, and progressive decoding can make the remaining wait all but unnoticeable.
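
To make the two orders of magnitude concrete, here is a minimal sketch of the loading times for the median webpage. It assumes the round numbers used in this post (a 20:1 lossy ratio, a usable preview after 10 percent of the file, a 3-Mbps link) and ignores latency; the function and key names are illustrative only.

```python
RAW_BYTES_PER_PIXEL = 3   # 8-bit R, G, and B
LINK_MBPS = 3.0           # the 3G connection from the example above


def seconds_to_transfer(num_bytes: float, link_mbps: float = LINK_MBPS) -> float:
    """Best-case transfer time for a given number of bytes."""
    return num_bytes * 8 / (link_mbps * 1_000_000)


def loading_stages(megapixels: float, lossy_ratio: float = 20.0,
                   preview_fraction: float = 0.10) -> dict:
    """Time until something is on screen, per delivery strategy."""
    raw = megapixels * 1_000_000 * RAW_BYTES_PER_PIXEL
    lossy = raw / lossy_ratio            # whole compressed file
    preview = lossy * preview_fraction   # first progressive preview
    return {
        "uncompressed": seconds_to_transfer(raw),
        "lossy": seconds_to_transfer(lossy),
        "first_preview": seconds_to_transfer(preview),
    }


for stage, seconds in loading_stages(2.1).items():
    print(f"{stage:>14}: {seconds:6.2f} s")
```

Each step cuts the wait by roughly a factor of 10 to 20, which is why the preview can feel close to instantaneous even on a slow link.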

Image Versus Video

For video codecs, progressive decoding of a single frame is a waste of time. That’s because videos contain many frames, displayed in rapid succession, and you must buffer enough of the compressed video data before it makes sense to start playback.

Nonetheless, many new image codecs are derived from video codecs: WebP is basically a single-frame VP8 WebM video; HEIC is a single-frame HEVC video; and AVIF is a single-frame AV1 video. Because of their video origins, however, they don’t support progressive decoding. Too bad—even though those formats can reach higher compression densities, you must wait until all or most of the image data has loaded before you can see anything.

As a result, even though AVIF's superior compression could, for example, turn a 150-KB JPEG into a 75-KB AVIF, the first preview might paradoxically take almost four times longer to appear. A progressive JPEG offers a reasonably promising preview once 20 KB has loaded, whereas the AVIF shows nothing until all 75 KB has arrived and been decoded. Besides, the more complicated AVIF format takes longer to decode than JPEG.

Previews and Placeholders

To use nonprogressive codecs like WebP and AVIF but still offer a somewhat progressive browsing experience, you can use low-quality image placeholders (LQIPs). In that case, you first serve a low-quality version of your images and then swap in the actual ones, for example with JavaScript.

The spectrum is wide, ranging from mere placeholders (really, really low-quality previews, e.g., a simple gradient of two predominant colors or a very blurry version of the image based on a dozen pixels only) to low-quality previews that can clue users in on the images, such as “quality 30” images as previews for the actual “quality 80” ones. In the case of AVIF and JPEG XL, you can embed LQIPs, saving the step of replacing the image externally.
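
At the placeholder end of that spectrum, a "dozen pixels" preview can be sketched as a simple box average over a 4x3 grid. The code below is an illustration under stated assumptions, not how any particular LQIP tool works: the image is a plain list of rows of (r, g, b) tuples, it must be at least as large as the grid, and `tiny_placeholder` is a hypothetical name.

```python
def tiny_placeholder(pixels, cols=4, rows=3):
    """Average an image down to a cols x rows grid of colors.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Assumes the image is at least cols wide and rows high.
    """
    height, width = len(pixels), len(pixels[0])
    grid = []
    for gy in range(rows):
        y0, y1 = gy * height // rows, (gy + 1) * height // rows
        out_row = []
        for gx in range(cols):
            x0, x1 = gx * width // cols, (gx + 1) * width // cols
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # Integer mean of each channel over the cell.
            out_row.append(tuple(sum(p[c] for p in cell) // len(cell)
                                 for c in range(3)))
        grid.append(out_row)
    return grid


# Toy 8x6 image: left half red, right half blue.
image = [[(255, 0, 0)] * 4 + [(0, 0, 255)] * 4 for _ in range(6)]
for row in tiny_placeholder(image):
    print(row)
```

Twelve colors fit in a few dozen bytes, which keeps the overhead low, but the result says very little about the actual image content.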

The downside of separate previews or placeholders is that the total transfer size inevitably goes up. The preview improves the early experience, but the final image takes longer to arrive, and all the bytes spent on the separate, redundant preview or placeholder are ultimately wasted. The smaller the LQIP, the lower its overhead, but also the less useful it is as a preview.

In contrast, progressive decoding does not waste bytes on separate previews: the first bytes of the actual high-quality image are the preview image. Talk about a welcome feature!

Improved Progressiveness

The state of the art of progressive images, which are as old as JPEG, has remained largely the same for 20 or 30 years. Excitingly, that’s starting to change.

First, the green martians I blogged about before—which can happen if the first luma and chroma information is not simultaneously available—are no longer an issue because browsers now wait until both chroma channels are available before showing a preview.

First program scan

Another recent improvement is in the upsampling techniques used to show the first preview of a progressive JPEG, which is an image at 1:8 resolution. Basically, one value per channel is available for every 8x8 block: its average color, known as the direct current (DC) coefficient. The simplest possible upsampling just fills each 8x8 block with its DC value, which yields a very blocky preview, as here:

Upsampling technique

An improved upsampling technique is now reaching browsers, creating a smoother, more appealing preview with fewer artifacts:

Improved upsampling technique
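
The difference between the two previews can be sketched on a single grayscale channel. The first function is the block fill described above. The second uses plain bilinear interpolation between block centers, which is only an assumed stand-in to show the idea of a smoother upsampling; this post does not specify the technique browsers actually use.

```python
def upsample_blocky(dc, block=8):
    """Fill every block x block tile with its DC (average) value."""
    out = []
    for row in dc:
        expanded = [value for value in row for _ in range(block)]
        out.extend(list(expanded) for _ in range(block))
    return out


def upsample_bilinear(dc, block=8):
    """Interpolate between DC values placed at the block centers."""
    h, w = len(dc), len(dc[0])
    out = []
    for y in range(h * block):
        # Pixel position in DC-grid coordinates, clamped at the borders.
        fy = min(max((y + 0.5) / block - 0.5, 0.0), h - 1.0)
        y0 = int(fy)
        y1 = min(y0 + 1, h - 1)
        ty = fy - y0
        row = []
        for x in range(w * block):
            fx = min(max((x + 0.5) / block - 0.5, 0.0), w - 1.0)
            x0 = int(fx)
            x1 = min(x0 + 1, w - 1)
            tx = fx - x0
            top = dc[y0][x0] * (1 - tx) + dc[y0][x1] * tx
            bottom = dc[y1][x0] * (1 - tx) + dc[y1][x1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        out.append(row)
    return out


dc = [[0, 80], [160, 240]]   # a 2x2 grid of DC values, i.e. a 16x16 image
blocky = upsample_blocky(dc)
smooth = upsample_bilinear(dc)
print(blocky[0][6:10])   # hard edge at the block boundary
print(smooth[0][6:10])   # gradual transition across the boundary
```

Both previews are built from exactly the same downloaded bytes; only the decoder-side rendering differs.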

Those techniques are for progressive JPEGs. More enhancements are forthcoming with JPEG XL. For example, JPEG XL can progressively encode the DC itself, so that the first preview arrives sooner. Normally, it takes 10 to 15 percent of the total file size to get the DC, which is the first full-image preview for a progressive JPEG. With progressive DC, you can create a first LQIP when only one percent of the total file size has arrived.

JPEG XL offers two more options for advanced progressive encoding:

  • Middle-out scans: In JPEG, scans are always top to bottom. JPEG XL encodes images in groups of 256x256 pixels, and you can reorder those groups. So, you can start each and every scan with the groups in the middle, which presumably contain the most enticing part of the image.
  • Saliency progression: Progressive scans of JPEGs must provide the same amount of new detail for every part of the image. Not so in the case of JPEG XL. That means you can progressively encode images based on saliency, such as by sending the faces or foreground objects in an image in more detail first, and the background later.
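
The middle-out idea boils down to a sort order over the 256x256 groups. The sketch below only illustrates that ordering; it is not the JPEG XL encoder's API, and `middle_out_order` is a hypothetical helper that ranks groups by the distance of their center from the image center.

```python
import math


def middle_out_order(width, height, group=256):
    """Return (column, row) group indices, nearest to the image center first."""
    cols = math.ceil(width / group)
    rows = math.ceil(height / group)
    cx, cy = width / 2, height / 2

    def distance_to_center(index):
        gx, gy = index
        # Bounds of this group, clipped to the image for edge groups.
        x0, y0 = gx * group, gy * group
        x1, y1 = min(x0 + group, width), min(y0 + group, height)
        return math.hypot((x0 + x1) / 2 - cx, (y0 + y1) / 2 - cy)

    raster = [(gx, gy) for gy in range(rows) for gx in range(cols)]
    # sorted() is stable, so equally distant groups keep raster order.
    return sorted(raster, key=distance_to_center)


# A 768x768 image has a 3x3 grid of groups.
print(middle_out_order(768, 768))
```

For saliency progression, the same pattern would apply with a different sort key, for instance a per-group saliency score from a face or object detector (an assumption here, not something this post specifies).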

Largest Contentful Paint

Largest Contentful Paint (LCP) is a new user-experience metric Google will adopt to determine the ranking of search results. Even though discussion is still ongoing, a consensus has been reached to consider progressive rendering as an LCP factor.

In general, enhanced progressive rendering leads to faster perceived web performance and an improved user experience. LCP will better capture those refinements, leading to higher Google-search rankings and stronger SEO.

The Expediency of JPEG XL

Unlike WebP, HEIC, and AVIF, JPEG and JPEG XL were designed for progressive decoding. The progressive capabilities of JPEG XL are superior to JPEG’s, however. Recall that reasonably appealing LQIPs become available with only a one-percent transfer of image data—and no need for separate and redundant LQIPs or preview images.

In summary, JPEG XL is a boon for the browsing experience, reducing bandwidth and displaying images faster and with higher fidelity. I’ll keep you posted on the format’s development.

My next article will discuss what it takes to create a codec to replace JPEG and why previous attempts failed. Stay tuned.
