Cloudinary Blog

Which Image Compression Technique Looks Best to Human Eyes?

Here at Cloudinary, we provide a cloud-based tool that enables our users to compress images and video for their websites and apps. Our goal is to preserve the visual integrity of the content while delivering the smallest possible file size to any device or browser, which ultimately improves website performance and end-user satisfaction.

One of the hallmarks of the Cloudinary solution is the ability to automate many functions of image compression, so that developers don’t have to spend time tweaking each photo and making multiple copies of different sizes and resolutions to fit every possible scenario. Compression algorithms can be tricky because they’re trying to make changes that have the smallest visual impact, but different images can react differently to compression.

As we were developing the algorithm for our “q_auto” capability – which strikes a balance between visual quality and file size – we needed to test it to understand how the resulting images looked to the human eye. Enter Scale API.

Many image compression formats – like JPEG 2000 and JPEG XR – have been tweaked to score well on particular metrics, such as peak signal-to-noise ratio (PSNR). But these metrics don’t always correlate with human perception of image quality.
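To make the limitation concrete, here is a minimal sketch of the standard PSNR formula in plain Python. It treats an image as a flat list of 8-bit sample values, which is a simplification for illustration, and the function name `psnr` is ours rather than part of any Cloudinary tooling. The point to notice is that PSNR only measures average pixel-wise error and knows nothing about where in the image the error falls or how visible it is.

```python
import math


def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio, in dB, between two equal-length sample lists.

    Assumes 8-bit samples by default (max_value=255). Higher means the
    compressed samples are numerically closer to the original.
    """
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of samples")
    if not original:
        raise ValueError("images must not be empty")

    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images

    return 10 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    original = [52, 55, 61, 66, 70, 61, 64, 73]
    compressed = [53, 55, 60, 66, 69, 62, 64, 72]
    print(f"PSNR: {psnr(original, compressed):.2f} dB")
```

Two compressed versions with the same mean squared error get the same PSNR, even if one spreads the error as faint noise and the other concentrates it in a single obvious artifact.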

We used Scale API to compare pairs of images and tell us which image humans preferred. With Scale API, we ran a variety of tests comparing several formats, including WebP, JPEG 2000, JPEG XR (lossy), Lepton (MozJPEG, recompressed with Lepton), FLIF, BPG, Daala, and PNG8 (pngquant+optipng). We were also able to get input on the difference between the uncompressed original image and a compressed version.

Scale API enabled us to create A/B comparisons that were viewed by human observers. We submitted over 4,000 image comparisons to Scale API, sending at least four independent Scale API requests for each pair of images. This resulted in at least eight actual human comparisons for each pair of images. The outcomes of these comparisons were evaluated alongside perceptual metrics such as PSNR, Google’s Butteraugli, DSSIM (Structural (Dis)Similarity) and a new metric Cloudinary developed called SSIMULACRA (Structural SIMilarity Unveiling Local And Compression Related Artifacts).

The results showed that overall, PSNR is “correct” in only 67 percent of the cases. Butteraugli gets it right in 80 percent of the cases, and DSSIM in 82 percent of the cases. Our new metric, SSIMULACRA, agrees with human judgments in 87 percent of the cases. Looking just at the high-confidence human judgments, we found about 78 percent agreement for PSNR, 91 percent for both Butteraugli and DSSIM, and almost 98 percent agreement for SSIMULACRA. You can read more about SSIMULACRA and these results on the Cloudinary blog. Or if you want to give it a try: SSIMULACRA is free and open-source software!
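As a rough illustration of what “agrees with human judgments” means here, the sketch below takes the majority of the human votes for each image pair and checks whether a metric prefers the same image. This is not our actual evaluation code: the function names, the choice to skip tied votes, and the sample data are all assumptions made for the example. It also assumes a distance-style metric, where a lower score means closer to the original.

```python
from collections import Counter


def human_preference(votes):
    """Return 'A' or 'B' for the majority vote, or None on a tie."""
    counts = Counter(votes)
    a, b = counts.get("A", 0), counts.get("B", 0)
    if a == b:
        return None
    return "A" if a > b else "B"


def metric_preference(score_a, score_b):
    """Return the image a distance-style metric prefers (lower is better)."""
    if score_a == score_b:
        return None
    return "A" if score_a < score_b else "B"


def agreement_rate(pairs):
    """Fraction of decided pairs where the metric matches the human majority.

    `pairs` is a list of (votes, score_a, score_b) tuples. Pairs where the
    human votes are tied are skipped (an assumption of this sketch).
    """
    decided = agreed = 0
    for votes, score_a, score_b in pairs:
        human = human_preference(votes)
        if human is None:
            continue
        decided += 1
        if metric_preference(score_a, score_b) == human:
            agreed += 1
    return agreed / decided if decided else 0.0


if __name__ == "__main__":
    # Hypothetical data: eight votes per pair plus made-up metric scores.
    pairs = [
        (["A"] * 7 + ["B"], 0.01, 0.03),      # humans A, metric A: agree
        (["B"] * 6 + ["A"] * 2, 0.02, 0.05),  # humans B, metric A: disagree
        (["A"] * 4 + ["B"] * 4, 0.02, 0.01),  # tie: skipped
        (["B"] * 8, 0.04, 0.02),              # humans B, metric B: agree
    ]
    print(f"Agreement: {agreement_rate(pairs):.0%}")
```

Restricting the same calculation to pairs with a lopsided vote (for example, seven or eight out of eight) is one way to look only at high-confidence judgments, though the exact threshold is a design choice.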

The results of the Scale API comparisons gave us useful data points to validate our metrics and provided more insight into the compression benchmarks we are running and how the various image formats compare. From these insights, we were able to improve our visual perception metrics and fine-tune our “q_auto” functionality, so we know how aggressively we can compress images.

Through this process we were impressed not only by the useful data points derived from Scale API, but also by the great support we got from the company and the product’s ease of use, all of which came at a reasonable price.

This was originally posted on Scale API
