Cloudinary Blog

FUIF: Why Do We Need a New Image File Format?

New Image File Format: FUIF: Why Do We Need a New Image Format

In my last post, I introduced FUIF, a new, free, and universal image format I’ve created. In this post and other follow-up pieces, I will explain the why, what, and how of FUIF.

Why Do We Need Another Image File Format?

Even though JPEG is still the most widely-used image file format on the web, it has limitations, especially the subset of the format that has been implemented in browsers and that has, therefore, become the de facto standard. Because JPEG has a relatively verbose header, it cannot be used (at least not as is) for low-quality image placeholders (LQIP), for which you need a budget of a few hundred bytes. JPEG cannot encode alpha channels (transparency); it is restricted to 8 bits per channel; and its entropy coding is no longer state of the art. Also, JPEG is not fully “responsive by design.” There is no easy way to find a file’s truncation offsets and it is limited to a 1:8 downscale (the DC coefficients). If you want to use the same file for an 8K UHD display (7,680 pixels wide) and for a smart watch (320 pixels wide), 1:8 is not enough. And finally, JPEG does not work well with nonphotographic images and cannot do fully lossless compression.

JPEG XR (WDP) and JPEG 2000 improve upon JPEG and take away some of its limitations. For example, they do support transparency and higher bit depth. Nonetheless, they’re still designed specifically for photographic images and have a rather long header.

As a result, existing JPEG formats are not really what we want. The JPEG Committee has launched a call for proposals for a next-generation image format. In the call, they specify “efficient coding of images with text and graphics” as a core requirement and “very low file size image coding” as a desirable requirement. I have submitted FUIF in response to that call.

Standards
Image source xkcd

WebP can be used for both photographic images (by means of its lossy, VP8-based mode) and nonphotographic images (by means of its lossless, PNG-like mode). However, it also has some important limitations. It is restricted to 8 bits and has obligatory chroma subsampling in its lossy mode. Also, because WebP does not have a progressive encoding, it’s is certainly not “responsive by design.”

Just like WebP, many other modern image formats are based on the intraframe encoding of a video codec. Better Portable Graphics (BPG) and High Efficiency Image File (HEIC) are both based on the High Efficiency Video Coding (HEVC) video codec; AV1 Image Format (AVIF) is based on AV1. They are excellent replacements for JPEG for cameras and phones. Thanks to tiling and hardware acceleration, fast encoding and decoding are possible with much better compression. But again, BPG, HEIC, and AVIF are not “responsive by design.” Furthermore, AV1 is an open and royalty-free video codec; HEVC is not, let alone that it’s very much patent encumbered.

What Are FUIF’s Design Principles?

As far as I know, FUIF is free in the sense that it is royalty free along with a free reference implementation that is based on open-source software.

Additionally, FUIF is an image format, not an image container like HEIF. FUIF can function as a stand-alone image file format but is minimalistic in that it just stores pixels and the minimal metadata must render those pixels. Any additional capability is delegated to the container format (e.g., HEIF), like layering, tiling, and geometric transformations, such as orientation, cropping, comments, Exif metadata, and annotations. This is a matter of separation of concerns and avoidance of duplication of functionality. Even though FUIF does support simple GIF-like animation, many animations are better encoded as a short video by taking advantage of interframe prediction, which is beyond the scope of image formats.

Finally, FUIF is a universal image format in the following ways:

  • Any kind of image
    • JPEG works great for photographic images. PNG is often better for nonphotographic images, like screenshots, illustrations, diagrams, cartoons, logos, and game graphics. It is not practical to require different image formats for different image content (or different subformats within a file format, e.g., lossy WebP and lossless WebP). No matter that different image contents require different compression techniques, those techniques in FUIF correspond to internal image transformations you can apply within the same format.
  • No arbitrary limitations
    • FUIF has no inherent limitations in terms of maximum image resolution, bit depth, or number of channels. Obviously, a practical implementation must impose limits (if only for security and performance reasons), but the format itself has no arbitrary limitations. Hence, at some point, if you want to do a 20-bit CMYK color model with an alpha channel and a depth channel, FUIF would fit the bill.
  • Any quality
    • From very low bitrates to lossless, FUIF aims to be at the Pareto front of compression density for a given perceptual (or objective) quality.
  • Any computational complexity
    • For one-to-many delivery, it is often OK to spend a lot of time to encode an image. In other use cases, a fast and predictable encode time is desired. FUIF’s modular approach in both its internal image transformations and entropy coding makes it possible to obtain a wide spectrum of trade-offs between encode time and compression density.

Besides FUIF being a free and universal image format, three other design principles guided its development:

  • First, as discussed in my previous post, FUIF is responsive by design. You need only one single file instead of having to downscale a high-resolution original to various target resolutions. FUIF has a minimalistic, compact header layout, so the first bits of actual image data appear as early as possible. That means you can produce an LQIP module within a small byte budget from the first few bytes of the full image.

    Furthermore, a FUIF file header contains a mandatory list of truncation offsets, from which you can easily figure out how many bytes to request or serve. In the context of a web page, it would make sense to embed the first few hundred bytes of each image in the HTML code so as to display LQIP images without additional HTTP requests. At the same time, the browser would have the actual image dimensions and truncation offsets for sending an HTTP range request for the extra bytes, if and when necessary.

  • Second, FUIF is legacy friendly, which I will explain in detail in my next post.

  • Third, the compression artifacts of FUIF are “honest”. That concept will also be a topic of a future post.

I hope the first two posts of this series on FUIF have helped you understand the reasons why we need yet another image file format and the behind-the-scenes design principles. Stay tuned for the upcoming posts on legacy friendliness and compression artifacts.

Note
The FUIF Code has now been made public.

Recent Blog Posts

The Evolution of the Role of Software Developers

Since the early 1990s, I’ve been watching with interest and intrigue the rapid evolution of the software industry and its developers, which ushered in the Internet age in the meantime. I’d like to share with you my observations on that subject. Because of the breadth and complexity of the related topics, I’ll chronicle them in a series of posts, starting with this one.

Read more
StubHub Finds Ticket to Effective Digital Asset Management with Cloudinary

As StubHub evolved from a lightweight website to one that was rich with images, videos and custom views from seat locations at event venues around the world, it sought a way to streamline image management. The company implemented Cloudinary, which enabled it to significantly reduce the time required to manually edit images, while ensuring that the key focal points of each image were viewable in more than 30 viewports across a multitude of devices. Cloudinary’s digital asset management capabilities also have helped StubHub better protect its content, creating a single repository that ensures developers and designers have access to only the most current images and understand the associated brand guidelines.

Read more
The Visual Media Report: Visual Engagement and User Experience

With privacy top of mind, we wondered what we might learn from analyzing the large volume of data. What user behaviors would we discover, what regional differences might exist? What insights or early hints from different industries could we extrapolate? These questions guided us as we analyzed millions of anonymous end-user experiences and asset interactions across our platform.

Read more