Cloudinary is a performance-focused company, so when my Cloud Four colleagues and I were asked to rebuild their blog using WordPress, we also set our sights on improving the blog’s performance.
The rebuilt blog was significantly faster than the old one, but there was still room for improvement. We performed multiple rounds of performance analysis and optimization. The results? We improved our median Lighthouse performance score from 78 to 88.
Many factors influence web performance, and it can be hard to decide which metric to start with. For a content-based site like the Cloudinary blog, the most important concern is site load speed. Google’s Web Vitals metrics are a helpful tool for approximating and understanding the page-loading experience for readers.
Google splits its Web Vitals into two main categories:
Core Web Vitals are the most important for end users and affect search rankings.
Other Web Vitals measure other performance characteristics that significantly impact the Core Web Vitals.
We used a few different tools to determine how the Cloudinary blog was performing:
- PageSpeed Insights allows you to measure Web Vitals and see what real Chrome users are experiencing. Note: User metrics are aggregated over a 28-day window, so it takes nearly a month to see the impact of changes on real users.
- Lighthouse is an open-source, automated tool that can be run locally using Chrome to diagnose issues with web pages. You can run it against any web page, public or requiring authentication. It has audits for performance, accessibility, progressive web apps, SEO, and more. (Be aware that its output is affected by the speed of your device, and you can only test one URL at a time.)
- Lighthouse Parade is an open-source tool built by Cloud Four that allows you to run Lighthouse on every page of your website and review the results in a spreadsheet.
- WebPageTest digs deeper into the performance of a single page and determines what is affecting page loading speed.
To assess the pre-optimized blog’s performance, identify issues, and track improvements, we followed a process of measurement, analysis, and experimentation:
- Run Lighthouse Parade to gauge the site’s overall performance.
- Identify low-performing pages to analyze more thoroughly using WebPageTest.
- Create a hypothesis for how these pages’ performance could be improved.
- Deploy the required changes.
- Run Lighthouse Parade again to measure the impact of our changes.
- Review the “Real User Metrics” exposed by Google to check for issues that Lighthouse may not expose.
- Repeat from Step 2.
Lighthouse and Lighthouse Parade scores can differ significantly based on how powerful the computer running the performance analysis is. More powerful computers are likely to get higher scores than less powerful computers.
When comparing Lighthouse Parade tests over time, it’s crucial to run the tests from the same device, so that device differences don’t skew your measurements of the changes you made.
Lighthouse tests run on powerful devices may also not catch performance issues that are present on less powerful devices.
For this article, we used a 2020 MacBook Pro. Results may not be representative of the average end-user experience.
Time to first byte (TTFB) measures how long it takes for a web server to return the first byte of a requested web page. This is an important metric because it affects all other web vitals. The browser can’t start building the web page or downloading other resources until the server responds.
We had a few things going for us when it came to TTFB:
- Our content was being served from a dedicated hosting environment.
- Our content was being distributed by a global CDN.
However, we noticed some unusually slow TTFB scores when we tested Cloudinary’s updated blog pages.
We dug into this to find ways to optimize TTFB speed:
- Since it’s a blog, the content doesn’t change often and can be cached for a long time. However, our host had a default page cache duration of five minutes. We updated this so that pages are cached for longer. This improves the chance that a page is served from the CDN cache instead of being rebuilt on the server.
- Some of our pages rely on external API calls to generate responsive breakpoints or Cloudinary SDK snippets. We were already caching the results of these API calls, but we increased the cache duration to reduce how often those calls were made.
- We’re using Timber to render Twig templates to generate our pages. We enabled Timber’s caching options to avoid re-running that logic more than necessary.
- We used PHP performance analysis tools to identify slow code on our listings pages and are in the process of optimizing this code.
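The caching changes above can be sketched with HTTP caching headers. The durations below are illustrative, not the exact values we used: a long `s-maxage` keeps pages in the CDN’s shared cache, while a short `max-age` keeps browsers from holding stale copies too long.

```http
Cache-Control: public, s-maxage=86400, max-age=300
```

With headers like these, most requests are answered by the CDN edge, and the origin server only rebuilds a page when the shared cache expires or is purged.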
These changes made big improvements to our TTFB speed and helped us pass our Core Web Vitals metrics.
The next metric we focused on was Largest Contentful Paint (LCP). LCP measures how long it takes to render the largest element in the user’s viewport. Or, how long does it take before users can view the most important content on the page?
For the Cloudinary blog, the LCP image is typically a blog post’s cover image, so we needed to optimize image load time. Again, we already had a few things going for us:
- The images were hosted via Cloudinary, so we could easily optimize and resize images, serve images in modern image formats, and distribute the images via Cloudinary’s global CDN.
- We were using responsive images to ensure there were appropriate image sizes for all devices and that users weren’t downloading giant images for tiny screens.
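As an illustration (the URLs, widths, and breakpoints here are hypothetical, not our exact markup), responsive images served through Cloudinary’s on-the-fly transformations look something like this — `f_auto` and `q_auto` pick the best format and quality, and `w_` resizes on demand:

```html
<img
  src="https://res.cloudinary.com/demo/image/upload/f_auto,q_auto,w_800/cover.jpg"
  srcset="
    https://res.cloudinary.com/demo/image/upload/f_auto,q_auto,w_400/cover.jpg 400w,
    https://res.cloudinary.com/demo/image/upload/f_auto,q_auto,w_800/cover.jpg 800w,
    https://res.cloudinary.com/demo/image/upload/f_auto,q_auto,w_1600/cover.jpg 1600w"
  sizes="(min-width: 48em) 50vw, 100vw"
  alt="Post cover image"
>
```

The browser picks the smallest candidate that satisfies the `sizes` hint, so a phone never downloads the 1600-pixel-wide version.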
There were further opportunities for us to improve the LCP speed. In addition to the size and delivery of the cover images, there were a few other things impacting LCP:
- The browser has to read the HTML response to find the URL for the LCP image, and it may not know to prioritize loading that image.
- We used preconnect to instruct browsers to begin connecting to the image CDN and downloading the LCP image earlier.
- We stored this information in an HTTP “Link” header so the browser would receive this information before it started reading the HTML response.
- We added fetchpriority="high" to the LCP image markup to tell browsers to prioritize downloading it.
- In the future, we’d like to use early hints to prompt the browser to download the image sooner.
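Sketched as markup (the image URL and attributes are illustrative), those hints look like this:

```html
<!-- Also sent as an HTTP header so the browser sees it before parsing HTML:
     Link: <https://res.cloudinary.com>; rel=preconnect -->
<link rel="preconnect" href="https://res.cloudinary.com">

<!-- Ask the browser to fetch the LCP image at high priority -->
<img
  src="https://res.cloudinary.com/demo/image/upload/f_auto,q_auto,w_800/cover.jpg"
  fetchpriority="high"
  alt="Post cover image"
>
```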
By avoiding render-blocking assets and instructing the browser to prioritize downloading our LCP image, we were able to drastically improve our LCP speeds. Our average LCP Lighthouse score increased from 70 to 77. (The median score increased from 85 to 91.)
First Input Delay (FID) represents how long it takes for a web page to respond to a user’s first interaction. For example, how long does it take for the page to respond if a user clicks a link or button?
FID requires a user to interact with the page, so it can only be extrapolated from real user data. You can’t test FID with automated tools. However, there are a couple of metrics that can be tested automatically and serve as helpful stand-ins:
- Max Potential First Input Delay measures the worst-case input delay for users. If you can improve the worst-case scenario, you’re likely improving real user experiences.
- Total Blocking Time measures how long the main thread is blocked during page loading.
We reviewed some heavy third-party libraries we were embedding on the site to see if we could reduce their impact:
- We were using a Google Custom Search integration to power a web search across all of Cloudinary’s websites. This was loaded on every page. We switched to only loading it on a special search page.
- We’re using Google Tag Manager to allow the marketing team to embed scripts. We’re able to run this on a worker thread to avoid blocking the main thread using a tool called Partytown. (We’ve temporarily disabled this on the site while we set up a proxy for a cookie consent service.)
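For reference, Partytown’s Google Tag Manager integration is configured with a small inline snippet (the GTM container ID below is a placeholder, and the script paths assume Partytown’s default setup):

```html
<script>
  // Forward GTM's dataLayer.push calls from the main thread to the worker
  partytown = {
    forward: ['dataLayer.push'],
  };
</script>
<script src="/~partytown/partytown.js"></script>

<!-- Third-party scripts opt in to running in the worker via this type -->
<script type="text/partytown" src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXXXX"></script>
```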
We were hopeful that these changes would make a big impact on FID. But, when we reviewed our real-user data using the Google Search Console, it told another story: Our average FID was over 200 milliseconds on mobile devices. This meant a portion of our mobile users were experiencing sub-par performance. Resolving this turned out to be a long and sometimes frustrating process.
In general, the first step in understanding a performance problem is being able to reproduce it yourself. From there you can test changes and quickly determine whether they resolve the issue for you. But, try as we might, we could not reproduce the slow FID times ourselves, even when testing on a wide variety of devices.
Our only insight into the slow FID scores was Google’s Real User Metrics data, but this was frustratingly short on details. We added the web-vitals library in an attempt to glean more information. By combining this with our analytics we could get more information about who was experiencing slow FID scores and on what pages.
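Wiring up field measurement with the web-vitals library only takes a few lines. A minimal sketch (the `/analytics` endpoint is hypothetical, and the API names are from web-vitals v3, which loads as a module straight from unpkg):

```html
<script type="module">
  import { onFID, onLCP, onTTFB } from 'https://unpkg.com/web-vitals@3?module';

  function report(metric) {
    // Beacon each metric, plus the page it came from, to our analytics
    navigator.sendBeacon('/analytics', JSON.stringify({
      name: metric.name,
      value: metric.value,
      page: location.pathname,
    }));
  }

  onFID(report);
  onLCP(report);
  onTTFB(report);
</script>
```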
We learned that our FID average score was being skewed by a handful of very long FID times. Some users were experiencing over a second of input delay. We also learned that these users tended to be on older, less powerful devices in areas with slower network speeds. But our data didn’t tell us why they were experiencing input delays, or what actions were triggering them. We were back where we started, without a clear path forward.
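To see how a handful of outliers can skew an average, consider this sketch with hypothetical FID samples:

```javascript
// Hypothetical FID samples in milliseconds: most users are fast,
// but two users on slow devices see over a second of input delay.
const samples = [40, 45, 50, 55, 60, 1200, 1500];

// Mean: total delay divided by number of samples
const mean = samples.reduce((sum, v) => sum + v, 0) / samples.length;

// Median: the middle value once samples are sorted
const sorted = [...samples].sort((a, b) => a - b);
const median = sorted[Math.floor(sorted.length / 2)];

console.log(Math.round(mean)); // 421 — the average looks alarming
console.log(median);           // 55 — but the typical user is fine
```

The average suggests a site-wide problem, while the median shows most users are fine — which is exactly why we had to hunt for the specific cause of the worst cases.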
We knew something on the blog was causing input delays and should be optimized or removed, but it wasn’t clear what. We did a number of trial-and-error experiments, optimizing different areas of the site and waiting to see their impact. We optimized how CodePens were loaded, changed how scripts were loaded, and tweaked many other aspects of the site, but nothing moved the needle.
We brainstormed with our friends at Cloudinary and they suggested taking a closer look at event listeners on the site. (Thanks Nadin!) We zeroed in on a couple areas of our site that added touch or scroll event listeners. We removed a third-party library that added touch event listeners and refactored some custom code that relied on listening for scroll events.
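We ultimately removed or refactored the listeners, but for cases where a touch or scroll listener must stay, marking it passive tells the browser it will never call preventDefault(), so scrolling doesn’t have to wait on it. A browser-only sketch:

```html
<script>
  // A passive listener promises not to block scrolling,
  // so the browser can respond to input without waiting on it.
  window.addEventListener(
    'scroll',
    () => {
      // Read scroll position, toggle a class, etc.
    },
    { passive: true }
  );
</script>
```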
Finally, some good news! After deploying these changes, our average mobile FID score dropped from over 200 milliseconds down to around 50! FID can be frustrating to debug since it can’t be tested using tools like Lighthouse, and is often device-specific. But it’s a helpful metric for understanding what real users are experiencing.
We saw significant improvements in our Web Vitals scores by dedicating time and resources to improving them. Using tools like Lighthouse Parade and WebPageTest allowed us to identify opportunities for improvement, make changes, and track our progress. Focusing on Web Vitals helped us understand both the experience our blog readers have and how technical choices impact their experiences. There’s still more we plan to do to improve performance because everyone deserves a great experience.