Cloudinary Blog

Evolution of <img>: Gif without the GIF

Embed MP4 in HTML <img> Tags for Improved GIF-Like Experience

TL;DR

  • Gifs are awesome but terrible for quality and performance
  • Replacing Gifs with <video> is /better/ but has perf. drawbacks: not preloaded, uses range requests
  • Now you can use <img src=".mp4"> in Safari Technology Preview
  • Early results show mp4s in <img> tags display 20x faster and decode 7x faster than the GIF equivalent - in addition to being 1/14th the file size!
  • Background CSS video & Responsive Video can now be a “thing”.
  • Finally - cinemagraphs without the downsides of Gifs! Now we wait for the other browsers to catch up

Intro

I both love (ode to Geocities) and hate (thanks, Tim Kadlec) animated Gifs.

Safari Tech Preview has changed all of this. Now I love and love animated “Gifs”.

Everybody loves animated Gifs!

Animated Gifs are a hack. To quote from the original Gif89a specification:

The Graphics Interchange Format is not intended as a platform for animation, even though it can be done in a limited way.

But they have become an awesome tool for cinemagraphs, memes, and creative expression. All of this awesomeness, however, comes at a cost. Animated Gifs are terrible for web performance. They are HUGE in size, impact cellular data bills, require more CPU and memory, cause repaints, and are battery killers. Typically Gifs are 12x larger files than H.264 videos, and take 2x the energy to load and display in a browser. And we’re spending all of those resources on something that doesn’t even look very good – the GIF 256 color limitation often makes GIF files look terrible (although there are some cool workarounds).

My daughter loves them – but she doesn't understand why her battery is always dead.

Gifs have many advantages: they are requested immediately by the browser preloader, they play and loop automatically, and they are silent! Implicitly they are also shorter. Market research has shown that users have higher engagement with, and generally prefer both micro-form video (< 1minute) and cinemagraphs (stills with subtle movement), over longer-form videos and still images. Animated Gifs are great for user experience.

videos that are <30s have highest conversion

So how did I go from love/hating Gifs to love/loving “Gifs”?

In the latest Safari Tech Preview, thanks to some hard work by Jer Noble, we can now use MP4 files in <img> tags. The intended use case is not long-form video, but micro-form, muted, looping video – just like animated Gifs. Take a look for yourself:

<img src="rocky.mp4">

Rocky!

Cool! This is going to be awesome on so many fronts – for business, for usability, and particularly for web performance!

... but we already have <video> tags?

As many have already pointed out, using the <video> tag is much better for performance than using animated Gifs. That’s why in 2014 Twitter famously added animated GIF support by not adding GIF support. Twitter instead transcodes Gifs to MP4s on-the-fly, and delivers them inside <video> tags. Since all browsers now support H.264, this was a very easy transition.

<video autoplay loop muted playsinline>
  <source src="eye-of-the-tiger-video.webm" type="video/webm">
  <source src="eye-of-the-tiger-video.mp4" type="video/mp4">
  <img src="eye-of-the-tiger-fallback.gif" />
</video>

Transcoding animated Gifs to MP4 is fairly straightforward. You just need to run ffmpeg -i source.gif output.mp4
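The one-line command works, but the resulting MP4 may not play everywhere: GIFs can have odd pixel dimensions and unusual pixel formats that some H.264 decoders reject. Here is a minimal sketch, assuming Node.js and an ffmpeg binary on the PATH, that builds a more defensive argument list. The helper name buildGifToMp4Args is hypothetical; the ffmpeg flags themselves are standard.

```javascript
// Build an ffmpeg argument list for converting an animated GIF to an H.264 MP4.
// buildGifToMp4Args is a hypothetical helper, not part of ffmpeg or of this article.
function buildGifToMp4Args(input, output) {
  return [
    "-i", input,
    // H.264 with 4:2:0 chroma needs even dimensions; round odd ones down.
    "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
    // 4:2:0 is the pixel format most decoders expect.
    "-pix_fmt", "yuv420p",
    // Move the index to the front of the file so playback can start early.
    "-movflags", "+faststart",
    output,
  ];
}

const args = buildGifToMp4Args("source.gif", "output.mp4");
console.log(["ffmpeg", ...args].join(" "));
```

To run the conversion rather than print it, the same array can be passed to require("child_process").execFileSync("ffmpeg", args).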

However, not everyone can overhaul their CMS and convert <img> to <video>. Even if you can, there are three problems with this method of delivering Gif-like, micro-form video:

  1. Browser performance is slow with <video>

    As Doug Sillars recently pointed out in an HTTP Archive post, there is a huge visual presentation performance penalty when using the <video> tag.

    Sites without video load about 28 percent faster than sites with video

    Unlike <img> tags, browsers do not preload <video> content. Generally preloaders only preload JavaScript, CSS, and image resources because they are critical for the page layout. Since <video> content can be any length – from micro-form to long-form – <video> tags are skipped until the main thread is ready to parse their content. This delays the loading of <video> content by many hundreds of milliseconds.

    For example, the hero video at the top of the Velocity conference page is only requested 5 full seconds into the page load. It’s the 27th requested resource and it isn’t even requested until after Start Render, after webfonts are loaded.

    Worse yet, many browsers assume that <video> tags contain long-form content. Instead of downloading the whole video file at once, which would waste your cell data plan in cases where you do not end up watching the whole video, the browser will first perform a 1-byte request to test if the server supports HTTP Range Requests. Then it will follow with multiple range requests in various chunk sizes to ensure that the video is adequately (but not over-) buffered. The consequence is multiple TCP round trips before the browser can even start to decode the content and significant delays before the user sees anything. On high-latency cellular connections, these round trips can set video loads back by hundreds or thousands of milliseconds.

    And what performs even worse than the native <video> element? The typical JavaScript video player. Often, the easiest way to embed a video on a site is to use a hosted service like YouTube or Vimeo and avoid the complexities of video encoding, hosting, and UX. This is normally a great idea, but for micro-form video, or critical content like hero videos, it just adds to the delay because of the JavaScript players and supporting resources that these hosting services inject (css/js/jpg/woff). In addition to the <video> markup you are forcing the browser to download, evaluate, and execute the JavaScript player -- and only then can the video start to load.

    As many people know, I love my Loki jacket because of its built in mitts, balaclava, and a hood that is sized for helmets. But take a look at the Loki USA homepage – which uses a great hero-video, hosted on Vimeo:

    lokiusa.com filmstrip

    lokiusa.com video

    If you look closely, you can see that the JavaScript for the player is actually requested soon after DOM Complete. But it isn’t fully loaded and ready to start the video stream until much later.

    lokiusa.com waterfall

    Check out the WPT Results

  2. You can’t right click and save video

    Most long-form video content – vlogs, TV, movies – is delivered via JavaScript-based players. Usually these players provide users with a convenient “share now” link or bookmark tool, so they can come back to YouTube (or wherever) and find the video again. In contrast, micro-form content – like memes and cinemagraphs – usually doesn’t come via a player, and users expect to be able to download animated Gifs and send them to friends, like they can with any image on the web. That meme of the dancing cat was sooo funny – I have to share it with all my friends!

    If you use <video> tags to deliver micro-form video, users can't right-click, click-and-drag, or force touch, and save. And their dancing-cat joy becomes a frustrating UX surprise.

  3. Autoplay abuse

    Finally, using <video> tags and MP4s instead of <img> tags and GIFs brings you into the middle of an ongoing cat and mouse game between browsers and unconscionable ad vendors, who abuse the <video autoplay> attribute in order to get the users’ attention. Historically, mobile browsers have ignored the autoplay attribute and/or refused to play videos inline, requiring them to go full screen. Over the last couple of years, Apple and Google have both relaxed their restrictions on inline, autoplay videos, allowing for Gif-like experiences with the <video> tag. But again, ad networks have abused this, causing further restrictions: if you want to autoplay <video> tags you need to mark the content with muted or remove the audio track altogether.
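To make the range-request cost from problem 1 concrete, here is a rough sketch of how a <video> fetch might be split up. Everything in it is an assumption for illustration: the one-byte probe, the 64 KB chunk size, the 300 ms round-trip time, and the function names. Real browsers pick their own probe and chunk sizes.

```javascript
// Hypothetical model of a <video> fetch: one small probe, then fixed-size chunks.
// Real browsers choose their own probe and chunk sizes; these numbers are assumptions.
function planRangeRequests(totalBytes, chunkBytes) {
  const ranges = ["bytes=0-0"]; // probe: does the server honor Range requests?
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    const end = Math.min(start + chunkBytes, totalBytes) - 1;
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}

// Rough lower bound on waiting time: one round trip per sequential request.
function roundTripCostMs(requestCount, rttMs) {
  return requestCount * rttMs;
}

// A 102 KB MP4 fetched in 64 KB chunks over an assumed 300 ms round trip.
const videoRequests = planRangeRequests(102 * 1024, 64 * 1024);
console.log(videoRequests);
console.log("video:", roundTripCostMs(videoRequests.length, 300), "ms of round trips");
console.log("img:  ", roundTripCostMs(1, 300), "ms of round trips");
```

The model ignores parallelism and connection setup, so treat the numbers as directional only. The point is that the <img> path needs a single request where the <video> path needs several sequential ones.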

... but we already have animated WebP! And animated PNG!

The GIF format isn’t the only animation-capable, still-image format. WebP and PNG have animation support, too. But, like GIF, they were not designed for animation and result in much larger files, compared to dedicated video codecs like H.264, H.265, VP9, and AV1.

Animated PNG is now widely supported across all browsers, and while it addresses the color palette limitation of GIF, it is still an inefficient file format for compressing video.

Animated WebP is better, but compared to true video formats, it’s still problematic. Aside from not having a formal standard, animated WebP lacks chroma subsampling and wide-gamut support. Further, the ecosystem of support is fragmented. Not even all versions of Android, Chrome, and Opera support animated WebP – even though those browsers advertise support with the Accept: image/webp header. You need Chrome 42+, Opera 15+ or Android 5+.

So while animated WebPs compress much better than animated GIFs or aPNGs, we can do better. (See file size comparisons below)

Having our cake and eating it too

By enabling true video formats (like MP4) to be included in <img> tags, Safari Technology Preview has fixed these performance and UX problems. Now, our micro-form videos can be small and efficient (like MP4s delivered via the <video> tag) and they can be easily preloaded, autoplayed, and shared (like our old friend, the animated GIF).

<img src="ottawa-river.mp4">

So how much faster is this going to be? Pull up the developer tools and see the difference in Safari Technology Preview and other browsers:

Take a look at this!

Unfortunately Safari doesn’t play nice with WebPageTest, and creating reliable benchmark tests is complicated. Likewise, Tech Preview’s usage is fairly low, so comparing performance with RUM tools is not yet practical.

We can, however, do two things. First, compare raw byte sizes, and second, use the Image.decode() promise to measure the device impact of different resources.

Byte Savings

First, the byte size savings. To compare this I took the trending top 100 animated GIFs from giphy.com and converted them into VP8, VP9, WebP, H.264, and H.265.

NB: These results should be taken as directional only! Each codec could be tuned much more; as you can see, the default VP9 encoding settings fare worse here than the default VP8 outputs. A more comprehensive study should be done that considers visual quality, measured by SSIM.

Below are the median (p50) results of the conversion:

Format       Bytes (p50)   % change (p50)
GIF          1,713 KB
WebP         310 KB        -81%
WebM/VP8     57 KB         -97%
WebM/VP9     66 KB         -96%
WebM/AV1     TBD
MP4/H.264    102 KB        -93%
MP4/H.265    43 KB         -97%

So, yes, an animated WebP will almost always be smaller than an animated GIF – but any video format will be much, much smaller. This shouldn’t surprise anyone since modern video codecs are highly optimized for online video streaming. H.265 fares very well, and we should expect the upcoming AV1 to fare well, too.

The benefits here will not only be faster transit but also substantial data-plan cost savings for end users.

Net-net, using video in <img> tags is going to be far, far better for users on cellular connections.
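For readers who want to relate the percentages in the table to the "1/14th the file size" figure in the TL;DR, here is a small sketch of the arithmetic. The function names and the 1,000 KB sample file are made up for illustration; only the -93% figure comes from the table.

```javascript
// Percentage change from an original size to a converted size (negative = smaller).
function percentChange(originalBytes, convertedBytes) {
  return ((convertedBytes - originalBytes) / originalBytes) * 100;
}

// Turn a percentage change into "the original is N times larger than the result".
function timesSmaller(percent) {
  return 1 / (1 + percent / 100);
}

// Hypothetical single file: a 1,000 KB GIF that converts to a 70 KB MP4.
const change = percentChange(1000, 70);
console.log(change.toFixed(0) + "%");
console.log(timesSmaller(change).toFixed(1) + "x smaller");
```

A 93% reduction leaves 7% of the original bytes, and 1 / 0.07 is roughly 14, which is where the 1/14th comes from.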

Decode and Visual Performance Improvements

Next, let’s consider the impact that decoding and displaying micro-form videos has on the browsing experience. H.264 (and H.265) has the notable advantage of being hardware decoded instead of using the primary core for decode.

How can we measure this? Since browsers haven’t yet implemented the proposed hero image API, we can use Steve Souders’ User Timing and Custom Metric strategy as a good approximation of when the image starts to display to the user. This strategy doesn’t measure frame rate, but it does tell us roughly when the first frame is displayed. Better yet, we can also use the newly adopted Image.decode() promise to measure decode performance. In the test page below, I inject a unique GIF and MP4 in an <img> tag 100 times and compare the decode and paint performance.

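function loadAndDecode(src) {
  const image = new Image();
  t_startReq = new Date().getTime();   // request start time, read later by timeOnLoad
  document.getElementById("testimg").appendChild(image);
  image.onload = timeOnLoad;           // timeOnLoad (defined in the test page) records first-frame time
  image.src = src;
  // decode() returns a promise that resolves once the image data has been decoded
  return image.decode().then(() => image);
}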

The results are quite impressive! Even on my powerful 2017 MacBook Pro, running the test locally, with no network throttling, I can see GIFs taking 20x longer than MP4s to draw the first frame (signaled by the onload event), and 7x longer to decode!

Local test on powerful MacBook Pro

Curious? Clone the repo and test for yourself. I will note that adding network conditions on the transit of the GIF v. MP4 will disproportionately skew the test results. Specifically: since decode can start happening before the last byte finishes, the delta between transfer, display and decode becomes much smaller. What this really tells us is that just the byte savings alone will substantially improve the user experience. However, factoring out the network as I’ve done on a localhost run, you can see that using video has substantial performance benefits for energy consumption as well.

How can you implement this?

So now that Safari Technology Preview supports this design pattern, how can you actually take advantage of it, without serving broken images to non-supporting browsers? Good news! It's relatively easy.

Option 1: Use Responsive Images

The simplest way is to use the <source type> attribute of the HTML5 <picture> tag.

<picture>
  <source type="video/mp4" srcset="cats.mp4">
  <source type="image/webp" srcset="cats.webp">
  <img src="cats.gif">
</picture>

I’d like to say we can stop there. However, there is this nasty WebKit bug in Safari that causes the preloader to download the first <source> regardless of the MIME type declaration. The main DOM loader realizes the error and selects the correct one. However, the damage will be done. The preloader squanders its opportunity to download the image early and on top of that, starts downloading the wrong version, wasting bytes. The good news is that I’ve patched this bug and the patch should land in Safari TP 45.

In short, using the <picture> and <source type> for MIME type selection is not advisable until the next version of Safari reaches 90%+ of Safari’s total user base.

Option 2: Use MP4, animated WebP and Fallback to GIF

If you don't want to change your HTML markup, you can use HTTP to send MP4s to Safari with content negotiation. In order to do so, you must generate multiple copies of your cinemagraphs (just like before) and Vary responses based on both the Accept and User-Agent headers.

This will get a bit cleaner once WebKit BUG 179178 is resolved and you can add a test for the Accept: video/* header (the same way that you can test for Accept: image/webp, now). But the end result is that each browser gets the best format for <img>-based micro-form videos that it supports:

Browser        Accept header         Response
Safari TP41+   (any)                 H.264 MP4
(any)          Accept: video/mp4     H.264 MP4
Chrome 42+     Accept: image/webp    aWebP
Opera 15+      Accept: image/webp    aWebP
(any)          Accept: image/apng    aPNG
Default        (any)                 GIF

In nginx this would look something like:

# Map Safari builds that support video in <img> to an ".mp4" suffix
map $http_user_agent $mp4_suffix {
    default          "";
    "~*Safari/605"   ".mp4";
}

location ~* \.gif$ {
    # Tell caches that the response depends on these request headers
    add_header Vary "Accept, User-Agent";
    # Serve cats.gif.mp4 to matching browsers when it exists, otherwise cats.gif
    try_files $uri$mp4_suffix $uri =404;
}

Of course, don't forget the Vary: Accept, User-Agent to tell coffee-shop proxies and your CDN to cache each response differently. In fact, you should probably mark the Cache-Control as private and use TLS to ensure that the less sophisticated ISP Performance-Enhancing-Proxies don't cache the content.

GET /example.gif HTTP/1.1
Accept: image/png, video/*, */*
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/605.1.13 (KHTML, like Gecko) Version/11.1 Safari/605.1.13

…

HTTP/1.1 200 OK
Content-Type: video/mp4
Content-Length: 22378567
Vary: Accept, User-Agent
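If your origin runs application code rather than nginx, the same selection logic can be expressed as a small function. This is a sketch under assumptions, not a drop-in implementation: pickAnimationFormat and its return shape are hypothetical, header names are assumed to be lower-cased (as Node.js does), and the "Safari/605" test simply mirrors the user-agent match used in the nginx example.

```javascript
// Choose a response format for an <img> request, following the browser table above.
// pickAnimationFormat and its return shape are hypothetical, for illustration only.
function pickAnimationFormat(headers) {
  const accept = (headers["accept"] || "").toLowerCase();
  const userAgent = headers["user-agent"] || "";

  let format = { ext: "gif", contentType: "image/gif" }; // default for everyone else
  if (/Safari\/605/i.test(userAgent) || accept.includes("video/mp4")) {
    format = { ext: "mp4", contentType: "video/mp4" };
  } else if (accept.includes("image/webp")) {
    format = { ext: "webp", contentType: "image/webp" };
  } else if (accept.includes("image/apng")) {
    format = { ext: "png", contentType: "image/apng" };
  }
  // Always vary on both headers so shared caches keep the variants apart.
  return { ...format, vary: "Accept, User-Agent" };
}

console.log(pickAnimationFormat({
  "accept": "image/png, video/*, */*",
  "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/605.1.13 (KHTML, like Gecko) Version/11.1 Safari/605.1.13",
}));
console.log(pickAnimationFormat({ "accept": "image/webp,image/apng,*/*" }));
```

User-agent sniffing is brittle, so a check like this would need revisiting as other browsers and later Safari releases change their user-agent strings and Accept headers.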

Option 3: Use RESS and Fall Back to GIF

If you can manipulate your HTML, you can adopt the Responsive Web Design + Server-Side Components (RESS) technique. This option moves the browser detection logic into your HTML output.

For example, you could do it like this with PHP:

<?php if (strpos($_SERVER['HTTP_USER_AGENT'] ?? '', 'Safari/605') !== false) { // Safari build with video-in-img support ?>
<img src="example.mp4">
<?php } else { // every other browser gets the GIF ?>
<img src="example.gif">
<?php } ?>

As above, be sure to emit a Vary: User-Agent response to inform your CDN that there are different versions of your HTML to cache. Some CDNs automatically honour the Vary headers while others can support this with a simple update to the CDN configuration.

Bonus: Don’t forget to remove the audio track

Now, since you aren’t converting GIF to MP4s but rather you are converting MP4s to GIFs, we should also remember to strip the audio track for extra byte savings. (Please tell me you aren’t using GIFs as your originals. Right?!) Audio tracks add extra bytes that we can quickly strip off since we know that our videos will be played on mute anyway. The simplest way to do this with ffmpeg is:

ffmpeg -i cats.mp4 -vcodec copy -an cats-muted.mp4

Are there size limits?

As I’m writing this, Safari will blindly download whatever video you specify in the <img> tag, no matter how long it is. On the one hand, this is expected because it helps improve the performance of the browser. Yet, this can be deadly if you push down a 120-minute video to the user. I've tested multiple sizes and all were downloaded as long as the user hung around. So, be courteous to your users. If you want to push longer-form video content, use the <video> tag for better performance.

What's next? Responsive video and hero backgrounds

Now that we can deliver MP4s via <img> tags, doors are opening to many new use cases. Two that come to mind: responsive video, and background videos. Now that we can put MP4s in srcsets, vary our responses for them using Client Hints and Content-DPR, and art direct them with <picture media>, well – think of the possibilities!

<img src="cat.mp4" alt="cat"
  srcset="cat-160.mp4 160w, cat-320.mp4 320w, cat-640.mp4 640w, cat-1280.mp4 1280w"
  sizes="(max-width: 480px) 100vw, (max-width: 900px) 33vw, 254px">
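If you generate markup like this for many clips, a tiny helper can keep the srcset in sync with the renditions you encode. This is a sketch only: buildSrcset is a hypothetical helper, and the "<name>-<width>.mp4" naming scheme is simply the one used in the example above.

```javascript
// Build a srcset string such as "cat-160.mp4 160w, cat-320.mp4 320w".
// The "<name>-<width>.<ext>" naming scheme follows the markup example above.
function buildSrcset(baseName, widths, ext = "mp4") {
  return widths.map((w) => `${baseName}-${w}.${ext} ${w}w`).join(", ");
}

console.log(buildSrcset("cat", [160, 320, 640, 1280]));
// → cat-160.mp4 160w, cat-320.mp4 320w, cat-640.mp4 640w, cat-1280.mp4 1280w
```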

Video in CSS background-image: url(.mp4) works, too!

<div style="width:800px; height:200px; background-image:url(colin.mp4)"></div>

Conclusion

By enabling video content in <img> tags, Safari Technology Preview is paving the way for awesome GIF-like experiences, without the terrible performance and quality costs associated with GIF files. This functionality will be fantastic for users, developers, designers, and the web. Besides the enormous performance wins that this change enables, it opens up many new use cases that media and ecommerce businesses have been yearning to implement for years. Here’s hoping the other browsers will soon follow. Google? Microsoft? Mozilla? Samsung? Your move!

This was originally posted on Performance Calendar
