A couple of months ago while taking a break from implementing cool new features like q_auto and g_auto, I was joking in our team chat about how well various image formats “compress” one-pixel images. In response, Orly — who runs the blog — asked me if I’d write a post about single-pixel images. I said: “Sure, why not. But it will be a very short blog post. After all, there’s not much you can say about a single pixel.”
Looks like I was wrong. Very wrong.
Back in the early days of the web, one-pixel images were widely used as a poor man’s solution to do things we now do with CSS. Spacing, creating lines or rectangles, semi-transparent backgrounds: there’s quite a lot you can do by simply scaling one pixel to arbitrary dimensions. Another use of one-pixel images, still a common practice today, is as a web beacon, for tracking or analytics.
In responsive web design, one-pixel images are often used as temporary placeholders while the page is loading. Since most browsers do not support client-hints, some responsive image solutions wait for the page to fully load in order to determine the actual rendered image sizes, and then replace a one-pixel image with the right breakpoint image using JavaScript.
There is one other use of single-pixel images: they can be used as ‘default’ images. If for whatever reason the actual image that you want to show cannot be found, it might in some cases be better to hide that fact (by showing one transparent pixel) than to return a “404 – Not Found” error, which will usually be rendered by browsers as a “broken image” icon. In both cases, you don’t get to see the intended image, but it might look a bit more professional if you don’t ‘rub it in’ by showing a broken image icon.
OK, it looks like one-pixel images do have some uses. So, what’s the best way to encode a 1×1 image?
Obviously, this is a fringe case for image compression formats. If the “image” only consists of a single pixel, there sure is not a lot of data to compress. In fact, the uncompressed data is just one bit to four bytes – depending on how you interpret the data: black & white (1 bit), grayscale (1 byte), grayscale + alpha (2 bytes), RGB (3 bytes), or RGBA (4 bytes).
But you can’t encode just the data. In any image format, you need to specify how to interpret the data. At the very least, you need to know the width and height of the image, and the number of bits or bytes per pixel.
Typically, to encode the width and height, four bytes are used: two bytes per number (if it were only one byte, the maximum image dimension would be 255×255). Let’s say that we need another byte to encode the color type of the image (e.g. grayscale, RGB or RGBA). In this minimalistic image format, a single-pixel image would take at least 6 bytes (e.g. for a white pixel) and at most 9 bytes (for a semi-transparent, arbitrary color pixel).
However, actual image formats tend to have a “header” that contains quite a bit more information. First of all, the first few bytes of any image format contain a fixed identifier that is only there to say “Hey! I’m a file in this particular file format!”. This fixed sequence of bytes is also known as the magic number. For example, a GIF file always starts with either GIF87a
or GIF89a
(depending on which version of the GIF spec is used), a PNG file always starts with an 8-byte sequence that includes PNG
, JPEG files have a header that contains the string JFIF
or Exif
, and so on.
Headers can contain all sorts of meta-information about an image. Some of it is format-specific information to indicate what kind of subformat is used, and is necessary to decode the pixels correctly. Some of it might not be necessary to decode the pixels, but is still useful to know how to render them – e.g. color profiles, orientation, gamma, or dots-per-pixel. Some of it might be arbitrary metadata, like comments, timestamps, copyright notices, or GPS coordinates. These things might be optional, or they might be obligatory; it depends on the format specification. Of course all of this metadata has some cost in terms of file size. So let’s focus on “minimal” files, where all of the non-obligatory metadata has been stripped. Otherwise we might be wasting precious bytes on silly things.
Besides headers, image formats may have other kinds of “overhead”. They may contain all kinds of markers and checksums, intended to make the format more robust in case of transmission errors or other forms of corruption. Also, sometimes some kind of padding is required, to ensure that the data gets aligned properly.
One-pixel images – the smallest possible images – reveal exactly how much “overhead” there is in an image format. Let’s take a look.
Here is a hexdump of a 67-byte PNG file, representing a 1×1 white pixel:
00000000 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 |.PNG........IHDR|
00000010 00 00 00 01 00 00 00 01 01 00 00 00 00 37 6e f9 |.............7n.|
00000020 24 00 00 00 0a 49 44 41 54 78 01 63 68 00 00 00 |$....IDATx.ch...|
00000030 82 00 81 4c 17 d7 df 00 00 00 00 49 45 4e 44 ae |...L.......IEND.|
00000040 42 60 82 |B`.|
This file consists of the 8-byte PNG magic number, followed by a header chunk (IHDR) which contains 13 bytes, an image data chunk (IDAT) with 10 bytes of “compressed” image data, and an end marker (IEND). Every chunk starts with a 4-byte chunk length and a 4-byte chunk identifier and ends with a 4-byte chunk checksum, and these three chunks are obligatory, so that’s another 36 bytes, for a total file size of 67 bytes.
A black pixel is also 67 bytes in PNG; a fully transparent pixel is 68 bytes, and an arbitrary RGBA color will be between 67 and 70 bytes.
JPEG has a longer header. The smallest one-pixel JPEG is 160 bytes (Update: 141 bytes). And it cannot be transparent, because JPEG does not support an alpha channel.
GIF is the most compact (in terms of headers) amongst the three universally supported image formats. A white pixel can be encoded as a valid GIF file in just 35 bytes:
00000000 47 49 46 38 37 61 01 00 01 00 80 01 00 00 00 00 |GIF87a..........|
00000010 ff ff ff 2c 00 00 00 00 01 00 01 00 00 02 02 4c |...,...........L|
00000020 01 00 3b |..;|
and a fully transparent pixel can be done in 43 bytes:
00000000 47 49 46 38 39 61 01 00 01 00 80 01 00 00 00 00 |GIF89a..........|
00000010 ff ff ff 21 f9 04 01 0a 00 01 00 2c 00 00 00 00 |...!.......,....|
00000020 01 00 01 00 00 02 02 4c 01 00 3b |.......L..;|
Note that for all of the above formats, you can come up with even smaller files that will still decode to a one-pixel image in all or most browsers, but they are not valid with respect to the format specifications, which means that an image decoder might at any time complain (rightfully) that the file is corrupt, and show the broken image icon which we were trying to avoid.
So what’s the best format for a one-pixel image on the web? That depends. If it’s an opaque pixel, then the answer is GIF. If it’s a fully transparent pixel, then the answer is also GIF. But if it’s a semi-transparent pixel, then the answer is PNG, since GIF only supports all-or-nothing transparency.
Not that all of this matters very much. All of these files fit easily in a single network package, so in practice, there is no real speed difference – and the storage needed for this is negligible anyway. But still, it’s an amusing thing to look at, at least for image format geeks like me.
If you use WebP for one-pixel images, be sure to use lossless WebP. A single-pixel lossless WebP image is between 34 and 38 bytes. A single-pixel lossy WebP image is between 44 and 104 bytes, depending mostly on whether there’s an alpha channel or not. For example, this is a fully transparent pixel as a 34-byte lossless WebP:
00000000 52 49 46 46 1a 00 00 00 57 45 42 50 56 50 38 4c |RIFF....WEBPVP8L|
00000010 0d 00 00 00 2f 00 00 00 10 07 10 11 11 88 88 fe |..../...........|
00000020 07 00 |..|
and here is the same pixel as a lossy (default) WebP of 82 bytes:
00000000 52 49 46 46 4a 00 00 00 57 45 42 50 56 50 38 58 |RIFFJ...WEBPVP8X|
00000010 0a 00 00 00 10 00 00 00 00 00 00 00 00 00 41 4c |..............AL|
00000020 50 48 0b 00 00 00 01 07 10 11 11 88 88 fe 07 00 |PH..............|
00000030 00 00 56 50 38 20 18 00 00 00 30 01 00 9d 01 2a |..VP8 ....0....*|
00000040 01 00 01 00 02 00 34 25 a4 00 03 70 00 fe fb fd |......4%...p....|
00000050 50 00 |P.|
The main difference between the two, is that a lossy WebP with transparency is actually stored internally as two images, thrown together into one container file: one lossy image for the RGB values, and one lossless image for the alpha values.
For Bellard’s BPG format, which also has a lossless and a lossy mode, it’s the other way around. The lossy BPG encoding of a single white pixel is 31 bytes, the smallest we’ve seen so far:
00000000 42 50 47 fb 00 00 01 01 00 03 92 47 40 44 01 c1 |BPG........G@D..|
00000010 71 81 12 00 00 01 26 01 af c0 b6 20 bc b6 fc |q.....&.... ...|
The lossless BPG for the same white pixel is 59 bytes. However, a fully transparent pixel is 57 or 113 bytes as a lossy or lossless BPG, respectively. Interestingly, for a single white pixel, BPG wins versus WebP (31 byte BPG vs 38 byte WebP), but for a single transparent pixel, WebP wins versus BPG (34 byte WebP vs 57 byte BPG).
And then there’s FLIF. As the main creator of the Free Lossless Image Format, obviously I cannot forget about that one. Here’s a 15 byte FLIF file for one white pixel:
00000000 46 4c 49 46 31 31 00 01 00 01 18 44 c6 19 c3 |FLIF11.....D...|
And here’s a 14 byte file for a black pixel:
00000000 46 4c 49 46 31 31 00 01 00 01 1e 18 b7 ff |FLIF11........|
The black pixel file is one byte smaller because the number zero happens to compress better than the number 255. The header is pretty simple: the first four bytes are always “FLIF”, the next byte is a human-readable indication of the color and interlacing type. In this case it is “1”, which means we have just one color channel (i.e. it’s a grayscale image). The next byte indicates the color depth: “1” means one byte per channel. And the next four bytes are the image dimensions, in this case 0x0001 by 0x0001. The last four or five bytes are the actual compressed data.
One fully transparent pixel is also 14 bytes in FLIF:
00000000 46 4c 49 46 34 31 00 01 00 01 4f fd 72 80 |FLIF41....O.r.|
In this case, we have 4 color channels (RGBA) instead of just one. You might expect the data section to be longer in this file (after all, there are four times as many color channels), but that’s not the case: since the alpha value happens to be zero (it’s a fully transparent pixel), the RGB values are considered irrelevant so they don’t end up being encoded at all.
For an arbitrary RGBA color, the FLIF file can be up to 20 bytes.
OK, so FLIF is the clear winner in the “one pixel” category of some weird image encoding competition. If only this were an important thing to compete at 🙂
Actually, no. FLIF isn’t the winner. Remember the minimalistic (and non-existent) image format I mentioned in the beginning? The one that would encode single-pixel images in 6 to 9 bytes? Well that format doesn’t exist, so I suppose it doesn’t count. But there is an image format that does exist, and which gets quite close to that.
It’s called the Portable Bitmap format (PBM), and it’s an uncompressed image format from the 1980s. Here’s how you could encode a single white pixel as a PBM file in just 8 bytes:
00000000 50 31 0a 31 20 31 0a 30 |P1.1 1.0|
Actually, forget about the hexdump, this is a human-readable file format. You can open it in a text editor if you want (at least this particular subformat):
P1
1 1
0
The first line (“P1”) indicates that this is a black & white image. Not grayscale; there are only two colors: black (which confusingly gets the number 1) and white (0). The second line indicates the image dimensions. And then it’s just a whitespace-delimited list of numbers, one number per pixel. So in this case just the number 0.
If you need something other than pure white or black, you can use the PGM format to get one pixel in any other shade of gray in just 12 bytes, or the PPM format to get any RGB color in just 14 bytes. This is always smaller than the corresponding FLIF file (or any other compressed format, for that matter).
The traditional PNM family (PBM, PGM and PPM) does not support transparency. There is an extension of PNM though, called Portable Arbitrary Map (PAM), which does support images with transparency. Unfortunately for our current purposes, its syntax is quite a bit more verbose. The smallest valid PAM file that encodes a fully transparent pixel, is the following:
P7
WIDTH 1
HEIGHT 1
DEPTH 4
MAXVAL 1
TUPLTYPE RGB_ALPHA
ENDHDR
\0\0\0\0
On the last line there are four zero (NULL) bytes. The above file is 67 bytes. You might be tempted to use grayscale+alpha instead of RGBA, because that would save two bytes in the data section. But that results in a 71 byte file, since you have to change the TUPLTYPE
from RGB_ALPHA
to GRAYSCALE_ALPHA
. Oh and by the way, your image software might not like the use of MAXVAL 1
, so you might need to change that to MAXVAL 255
(which takes two more bytes).
So all in all, for one-pixel images, when there’s no transparency involved, PNM is the smallest (8 to 14 bytes for PNM vs 14 to 18 bytes for FLIF), but when there is transparency, FLIF is smallest (14 to 20 bytes for FLIF vs 67 to 69 bytes for PAM).
Here is a summary table that gives the (optimal) file sizes for various one-pixel images:
white |
black |
gray |
yellow #FFFF00 |
transparent |
semitransparent #1337BABE |
|
PNG |
67 |
67 |
67 |
69 |
68 |
70 |
GIF |
35 |
35 |
43 |
35 |
43 |
/ |
JPEG |
160 |
160 |
159 |
288 |
/ |
/ |
Lossy WebP |
44 |
44 |
44 |
64 |
82 |
92 |
Lossless WebP |
38 |
34 |
38 |
36 |
34 |
38 |
Lossy BPG |
31 |
31 |
29 |
36 |
57 |
62 |
Lossless BPG |
59 |
59 |
37 |
124 |
113 |
160 |
FLIF |
15 |
14 |
15 |
18 |
14 |
20 |
PNM/PAM |
8 |
8 |
12 |
14 |
67 |
69 |
It might seem a bit surprising that an uncompressed image format actually beats most of the compressed formats at this particular task. But it’s not that surprising if you think about it. One-pixel images are in a sense the worst-case scenario for image compression: they’re all headers and overhead, and very little data. And the very little data there is cannot really be compressed because compression depends on predictability, and how are you supposed to predict one single pixel?
In part two of this blog post I will discuss the other extreme. How well do extremely predictable single-color images perform in various formats? Stay tuned….
Update: Check out part two as well: A one-color image is worth two thousand words