Cloudinary Blog

FUIF: Why Do We Need a New Image File Format?


In my last post, I introduced FUIF, a new, free, and universal image format I’ve created. In this post and other follow-up pieces, I will explain the why, what, and how of FUIF.

Why Do We Need Another Image File Format?

Even though JPEG is still the most widely used image file format on the web, it has limitations, especially the subset of the format that has been implemented in browsers and that has, therefore, become the de facto standard. Because JPEG has a relatively verbose header, it cannot be used (at least not as is) for low-quality image placeholders (LQIP), for which you need a budget of a few hundred bytes. JPEG cannot encode alpha channels (transparency); it is restricted to 8 bits per channel; and its entropy coding is no longer state of the art. Also, JPEG is not fully “responsive by design.” There is no easy way to find a file’s truncation offsets, and downscaling is limited to 1:8 (the DC coefficients). If you want to use the same file for an 8K UHD display (7,680 pixels wide) and for a smart watch (320 pixels wide), 1:8 is not enough. And finally, JPEG does not work well with nonphotographic images and cannot do fully lossless compression.
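
To make the 1:8 argument concrete, here is a small arithmetic sketch in Python. The two widths and the 1:8 limit come from the paragraph above; the function and constant names are made up for illustration.

```python
from fractions import Fraction

UHD_8K_WIDTH = 7680      # pixels, 8K UHD display (from the text)
WATCH_WIDTH = 320        # pixels, smart watch (from the text)
JPEG_MAX_DOWNSCALE = 8   # 1:8, i.e. using only the DC coefficients


def smallest_native_width(full_width: int, max_downscale: int) -> int:
    """Width of the smallest version the format can yield directly."""
    return full_width // max_downscale


def required_downscale(full_width: int, target_width: int) -> Fraction:
    """Downscale ratio needed to go from full_width to target_width."""
    return Fraction(full_width, target_width)


smallest = smallest_native_width(UHD_8K_WIDTH, JPEG_MAX_DOWNSCALE)
needed = required_downscale(UHD_8K_WIDTH, WATCH_WIDTH)

print(smallest)  # 960: still three times wider than the watch needs
print(needed)    # 24: the watch needs 1:24, well beyond 1:8
```

In other words, even the smallest version you can get from the DC coefficients of an 8K-wide JPEG is 960 pixels wide, so the watch would still receive far more data than it can display.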

JPEG XR (WDP) and JPEG 2000 improve upon JPEG and remove some of its limitations. For example, they support transparency and higher bit depths. Nonetheless, they are still designed specifically for photographic images and have rather long headers.

As a result, existing JPEG formats are not really what we want. The JPEG Committee has launched a call for proposals for a next-generation image format. In the call, they specify “efficient coding of images with text and graphics” as a core requirement and “very low file size image coding” as a desirable requirement. I have submitted FUIF in response to that call.

Image: “Standards” (source: xkcd)

WebP can be used for both photographic images (by means of its lossy, VP8-based mode) and nonphotographic images (by means of its lossless, PNG-like mode). However, it also has some important limitations. It is restricted to 8 bits per channel and has obligatory chroma subsampling in its lossy mode. Also, because WebP does not have a progressive encoding, it is certainly not “responsive by design.”

Just like WebP, many other modern image formats are based on the intraframe encoding of a video codec. Better Portable Graphics (BPG) and High Efficiency Image File (HEIC) are both based on the High Efficiency Video Coding (HEVC) video codec; AV1 Image Format (AVIF) is based on AV1. They are excellent replacements for JPEG for cameras and phones. Thanks to tiling and hardware acceleration, fast encoding and decoding are possible with much better compression. But again, BPG, HEIC, and AVIF are not “responsive by design.” Furthermore, whereas AV1 is an open and royalty-free video codec, HEVC is neither; it is heavily patent encumbered.

What Are FUIF’s Design Principles?

FUIF is free in the sense that, as far as I know, it is royalty free, and it comes with a free reference implementation that is based on open-source software.

Additionally, FUIF is an image format, not an image container like HEIF. FUIF can function as a stand-alone image file format but is minimalistic in that it stores just the pixels and the minimal metadata needed to render those pixels. Any additional capability, such as comments, Exif metadata, annotations, layering, tiling, and geometric transformations (e.g., orientation and cropping), is delegated to the container format (e.g., HEIF). This is a matter of separation of concerns and avoidance of duplication of functionality. Even though FUIF does support simple GIF-like animation, many animations are better encoded as a short video by taking advantage of interframe prediction, which is beyond the scope of image formats.

Finally, FUIF is a universal image format in the following ways:

  • Any kind of image
    • JPEG works great for photographic images. PNG is often better for nonphotographic images, like screenshots, illustrations, diagrams, cartoons, logos, and game graphics. It is not practical to require different image formats for different image content (or different subformats within a file format, e.g., lossy WebP and lossless WebP). Although different image content requires different compression techniques, in FUIF those techniques correspond to internal image transformations that you can apply within the same format.
  • No arbitrary limitations
    • FUIF has no inherent limitations in terms of maximum image resolution, bit depth, or number of channels. Obviously, a practical implementation must impose limits (if only for security and performance reasons), but the format itself has no arbitrary limitations. Hence, if at some point you want to store a 20-bit CMYK image with an alpha channel and a depth channel, FUIF would fit the bill.
  • Any quality
    • From very low bitrates to lossless, FUIF aims to be at the Pareto front of compression density for a given perceptual (or objective) quality.
  • Any computational complexity
    • For one-to-many delivery, it is often OK to spend a lot of time to encode an image. In other use cases, a fast and predictable encode time is desired. FUIF’s modular approach in both its internal image transformations and entropy coding makes it possible to obtain a wide spectrum of trade-offs between encode time and compression density.

Besides FUIF being a free and universal image format, three other design principles guided its development:

  • First, as discussed in my previous post, FUIF is responsive by design. You need only a single file instead of having to downscale a high-resolution original to various target resolutions. FUIF has a minimalistic, compact header layout, so the first bits of actual image data appear as early as possible. That means you can produce an LQIP within a small byte budget from the first few bytes of the full image.

    Furthermore, a FUIF file header contains a mandatory list of truncation offsets, from which you can easily figure out how many bytes to request or serve. In the context of a web page, it would make sense to embed the first few hundred bytes of each image in the HTML code so as to display LQIP images without additional HTTP requests. At the same time, the browser would have the actual image dimensions and truncation offsets for sending an HTTP range request for the extra bytes, if and when necessary.

  • Second, FUIF is legacy friendly, which I will explain in detail in my next post.

  • Third, the compression artifacts of FUIF are “honest”. That concept will also be a topic of a future post.
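
The truncation-offset idea from the first principle can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not describe the actual FUIF header layout, so the `TruncationPoint` record, the pairing of each offset with a preview width, and all of the numbers below are hypothetical. The only thing taken from a standard is the `Range: bytes=first-last` syntax of HTTP range requests, where both positions are inclusive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TruncationPoint:
    """Hypothetical record: the first `offset` bytes decode to a preview `width` pixels wide."""
    offset: int
    width: int


def bytes_needed(points: List[TruncationPoint], target_width: int, file_size: int) -> int:
    """Smallest prefix length that yields at least target_width pixels."""
    for point in sorted(points, key=lambda p: p.offset):
        if point.width >= target_width:
            return point.offset
    return file_size  # no truncation point is large enough: fetch the whole file


def range_header(already_have: int, needed: int) -> Optional[str]:
    """HTTP Range header value for the missing bytes, or None if nothing is missing."""
    if needed <= already_have:
        return None
    return f"bytes={already_have}-{needed - 1}"  # byte positions are inclusive


# Made-up example: 400 bytes were embedded in the HTML as the LQIP.
points = [TruncationPoint(400, 60), TruncationPoint(6_000, 480), TruncationPoint(90_000, 1920)]
file_size = 1_200_000
embedded = 400

needed = bytes_needed(points, target_width=320, file_size=file_size)
print(range_header(embedded, needed))  # bytes=400-5999
```

With these made-up numbers, a 320-pixel-wide viewport needs only the first 6,000 bytes, so the browser would ask for bytes 400 through 5,999 and append them to the prefix it already has. If the embedded prefix is already sufficient, no further request is needed at all.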

I hope the first two posts of this series on FUIF have helped you understand why we need yet another image file format and which design principles are behind it. Stay tuned for the upcoming posts on legacy friendliness and compression artifacts.

Note
The FUIF Code has now been made public.

