Cloudinary Blog

FUIF: Why Do We Need a New Image File Format?


In my last post, I introduced FUIF, a new, free, and universal image format I’ve created. In this post and other follow-up pieces, I will explain the why, what, and how of FUIF.

Why Do We Need Another Image File Format?

Even though JPEG is still the most widely used image file format on the web, it has limitations, especially the subset of the format that has been implemented in browsers and that has, therefore, become the de facto standard. Because JPEG has a relatively verbose header, it cannot be used (at least not as is) for low-quality image placeholders (LQIP), for which you need a budget of a few hundred bytes. JPEG cannot encode alpha channels (transparency); it is restricted to 8 bits per channel; and its entropy coding is no longer state of the art. Also, JPEG is not fully “responsive by design.” There is no easy way to find a file’s truncation offsets, and the smallest version you can extract is a 1:8 downscale (the DC coefficients). If you want to use the same file for an 8K UHD display (7,680 pixels wide) and for a smart watch (320 pixels wide), 1:8 is not enough. And finally, JPEG does not work well with nonphotographic images and cannot do fully lossless compression.
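
To make the 1:8 limit concrete, here is a small Python sketch of the arithmetic behind that example. The function name is made up for illustration and is not part of any JPEG library; the two widths are the ones mentioned above.

```python
def smallest_dc_width(full_width: int, max_downscale: int = 8) -> int:
    """Width of the smallest preview a 1:max_downscale limit allows (rounded up)."""
    return -(-full_width // max_downscale)  # ceiling division


uhd_width = 7680    # 8K UHD display, from the example above
watch_width = 320   # smart watch, from the example above

needed_ratio = uhd_width // watch_width        # 24, i.e. a 1:24 downscale
dc_only_width = smallest_dc_width(uhd_width)   # 960 pixels

print(f"needed downscale: 1:{needed_ratio}")
print(f"smallest JPEG preview: {dc_only_width} px wide, "
      f"{dc_only_width // watch_width}x wider than the watch needs")
```

In other words, the watch would still have to download and downscale an image three times wider than its screen.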

JPEG XR (WDP) and JPEG 2000 improve upon JPEG and remove some of its limitations. For example, they support transparency and higher bit depths. Nonetheless, they’re still designed specifically for photographic images and have a rather long header.

As a result, existing JPEG formats are not really what we want. The JPEG Committee has launched a call for proposals for a next-generation image format. In the call, they specify “efficient coding of images with text and graphics” as a core requirement and “very low file size image coding” as a desirable requirement. I have submitted FUIF in response to that call.

Image source: xkcd

WebP can be used for both photographic images (by means of its lossy, VP8-based mode) and nonphotographic images (by means of its lossless, PNG-like mode). However, it also has some important limitations. It is restricted to 8 bits per channel and has obligatory chroma subsampling in its lossy mode. Also, because WebP does not have a progressive encoding, it is certainly not “responsive by design.”

Just like WebP, many other modern image formats are based on the intraframe encoding of a video codec. Better Portable Graphics (BPG) and HEIC (HEVC-coded images in the High Efficiency Image File Format, HEIF) are both based on the High Efficiency Video Coding (HEVC) video codec; AV1 Image Format (AVIF) is based on AV1. They are excellent replacements for JPEG for cameras and phones. Thanks to tiling and hardware acceleration, fast encoding and decoding are possible with much better compression. But again, BPG, HEIC, and AVIF are not “responsive by design.” Furthermore, while AV1 is an open and royalty-free video codec, HEVC is neither: it is heavily patent encumbered.

What Are FUIF’s Design Principles?

FUIF is free in the sense that, as far as I know, it is royalty free, and it comes with a free reference implementation released as open-source software.

Additionally, FUIF is an image format, not an image container like HEIF. FUIF can function as a stand-alone image file format but is minimalistic in that it stores just the pixels and the minimal metadata needed to render those pixels. Any additional capability is delegated to the container format (e.g., HEIF): layering, tiling, geometric transformations such as orientation and cropping, comments, Exif metadata, and annotations. This is a matter of separation of concerns and avoidance of duplication of functionality. Even though FUIF does support simple GIF-like animation, many animations are better encoded as a short video by taking advantage of interframe prediction, which is beyond the scope of image formats.

Finally, FUIF is a universal image format in the following ways:

  • Any kind of image
    • JPEG works great for photographic images. PNG is often better for nonphotographic images, like screenshots, illustrations, diagrams, cartoons, logos, and game graphics. It is not practical to require different image formats for different image content (or different subformats within a file format, e.g., lossy WebP and lossless WebP). Different image contents do require different compression techniques, but in FUIF those techniques correspond to internal image transformations you can apply within the same format.
  • No arbitrary limitations
    • FUIF has no inherent limitations in terms of maximum image resolution, bit depth, or number of channels. Obviously, a practical implementation must impose limits (if only for security and performance reasons), but the format itself has no arbitrary limitations. So if, at some point, you want to store a 20-bit CMYK image with an alpha channel and a depth channel, FUIF would fit the bill.
  • Any quality
    • From very low bitrates to lossless, FUIF aims to be at the Pareto front of compression density for a given perceptual (or objective) quality.
  • Any computational complexity
    • For one-to-many delivery, it is often OK to spend a lot of time to encode an image. In other use cases, a fast and predictable encode time is desired. FUIF’s modular approach in both its internal image transformations and entropy coding makes it possible to obtain a wide spectrum of trade-offs between encode time and compression density.
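
As a rough illustration of the “No arbitrary limitations” point above, here is a Python sketch of how an implementation might enforce its own limits while the format stays unrestricted. Everything in it is an assumption made for the example: the `DecoderLimits` class, its default values, and the choice to pad each sample to whole bytes are hypothetical and are not taken from the FUIF reference implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderLimits:
    """Hypothetical limits chosen by one implementation, not by the format."""
    max_pixels: int = 1 << 28             # 268,435,456 pixels
    max_channels: int = 16
    max_bit_depth: int = 32
    max_buffer_bytes: int = 4 * 1024**3   # 4 GiB


def decoded_size_bytes(width: int, height: int, channels: int, bit_depth: int) -> int:
    """Uncompressed size, assuming each sample is padded to whole bytes."""
    bytes_per_sample = (bit_depth + 7) // 8
    return width * height * channels * bytes_per_sample


def within_limits(width: int, height: int, channels: int, bit_depth: int,
                  limits: DecoderLimits = DecoderLimits()) -> bool:
    """Check the declared image dimensions before allocating any memory."""
    return (
        width > 0 and height > 0
        and width * height <= limits.max_pixels
        and 0 < channels <= limits.max_channels
        and 0 < bit_depth <= limits.max_bit_depth
        and decoded_size_bytes(width, height, channels, bit_depth)
        <= limits.max_buffer_bytes
    )


# The example from the text: 20-bit CMYK plus alpha plus depth = 6 channels,
# here at an (assumed) 8K UHD resolution of 7680 x 4320.
print(decoded_size_bytes(7680, 4320, channels=6, bit_depth=20))  # 597196800
print(within_limits(7680, 4320, channels=6, bit_depth=20))       # True
```

The limits sit in the decoder's configuration rather than in the file format, so a different implementation could pick different values without any change to the files themselves.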

Besides FUIF being a free and universal image format, three other design principles guided its development:

  • First, as discussed in my previous post, FUIF is responsive by design. You need only a single file instead of having to downscale a high-resolution original to various target resolutions. FUIF has a minimalistic, compact header layout, so the first bits of actual image data appear as early as possible. That means you can produce an LQIP within a small byte budget from the first few bytes of the full image.

    Furthermore, a FUIF file header contains a mandatory list of truncation offsets, from which you can easily figure out how many bytes to request or serve. In the context of a web page, it would make sense to embed the first few hundred bytes of each image in the HTML code so as to display LQIP images without additional HTTP requests. At the same time, the browser would have the actual image dimensions and truncation offsets for sending an HTTP range request for the extra bytes, if and when necessary.

  • Second, FUIF is legacy friendly, which I will explain in detail in my next post.

  • Third, the compression artifacts of FUIF are “honest”. That concept will also be a topic of a future post.
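
The truncation-offset idea from the first principle above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not describe the header layout, so the list of (preview width, truncation offset) pairs, both function names, and all the numbers below are hypothetical stand-ins for whatever a real decoder would expose. Only the `bytes=start-end` syntax of the HTTP `Range` header is standard.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical stand-in for what a decoder might expose after parsing the header:
# (preview width in pixels, truncation offset in bytes), sorted by offset.
TruncationPoint = Tuple[int, int]


def pick_truncation_offset(points: Sequence[TruncationPoint], target_width: int) -> int:
    """Return the smallest offset whose preview is at least target_width wide.

    Falls back to the last (largest) offset when no preview is wide enough.
    Assumes points is non-empty.
    """
    for width, offset in points:
        if width >= target_width:
            return offset
    return points[-1][1]


def range_header(bytes_already_held: int, truncation_offset: int) -> Optional[str]:
    """Build an HTTP Range header value for the bytes still missing, or None."""
    if bytes_already_held >= truncation_offset:
        return None  # the embedded prefix already covers this truncation point
    # HTTP byte ranges are zero-based and inclusive at both ends.
    return f"bytes={bytes_already_held}-{truncation_offset - 1}"


# Made-up numbers for illustration only.
points = [(60, 300), (480, 9_000), (1920, 120_000), (7680, 1_500_000)]
embedded = 300  # bytes of the file inlined in the HTML for the LQIP

offset = pick_truncation_offset(points, target_width=320)
print(offset)                          # 9000
print(range_header(embedded, offset))  # bytes=300-8999
```

With these made-up numbers, a 320-pixel-wide smart watch would request fewer than 9 KB on top of the inlined prefix, while an 8K display would ask for the whole file.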

I hope the first two posts of this series on FUIF have helped you understand the reasons why we need yet another image file format and the behind-the-scenes design principles. Stay tuned for the upcoming posts on legacy friendliness and compression artifacts.

The FUIF code has now been made public.
