Reducing Image Delivery Bandwidth with Cloudinary
Join Grant Sander, Vice President of Engineering at Formidable, in this Cloudinary DevJams episode! You’ll learn how Grant migrated images to Cloudinary and experienced significant enhancements for their website, such as next-generation image format delivery and support.
Grant demonstrates the process and code that Formidable used to migrate images, resulting in an 86% reduction in image bandwidth for the site. Follow along with Cloudinary’s Customer Education team to gain a better understanding of the motivation and work involved in the process, from identifying the problem to measuring success.
If you’ve been storing your assets in a code repository and creating your own optimization processing pipeline, but finding the process to be challenging, this is a conversation you won’t want to miss.
Sam Brace: [00:00:00] Welcome to DevJams. This is where we talk with developers who are doing inspiring, innovative, interesting projects when it comes to images and videos, which happen to be part of most development projects, and they are likely using Cloudinary for all of that innovative work. My name is Sam Brace. I am the Senior Director of Customer Education and Community for all of Cloudinary, and I’m very happy that you are here today for this DevJams episode. This is gonna be a fabulous one if you are interested in images, but also in making them as lightweight as possible. And we’re gonna be talking a lot today about bandwidth and how to properly reduce that using functionality available with Cloudinary’s APIs, SDKs, and our overall transformations.
Joining me for this episode is [00:01:00] Jen Brissman. She is a Technical Curriculum Engineer here at Cloudinary on our Customer Education team, and I am very happy to have her here for this program. Jen, welcome to the show.
Jen Brissman: Hey, Sam. Thanks for having me.
Sam Brace: So Jen and I are gonna be talking to Grant Sander, who is the Vice President of Engineering at Formidable, which is a firm that’s focusing on design and development for some fairly large customers.
That includes probably a lot of the brands that you’re very familiar with, maybe you’re purchasing from every single day. The overall development and design work that’s happening from them, a lot of those aspects are being handled by Formidable and he’s gonna be talking about their own website, formidable.com, and they’re gonna be diving into some of the ways that they were able to reduce the bandwidth and then of course, associate that for some of the work that they’re doing for their clients.
But Jen, why is that exciting to you? What’s getting you fired up about this episode?
Jen Brissman: I’m excited to have Grant here. We’re so lucky to have him because he’s [00:02:00] working with Formidable, one of our partners at Cloudinary, and we don’t always get the chance to talk to people who we work with as well.
And they use us, we use them. So it’s cool to see how everything works together. And we’re gonna get into that with Grant today.
Sam Brace: I agree. And what’s also wonderful about this is that I think what we’re gonna talk about is a very natural evolution that we’ve seen companies take when it comes to maturity of working with images and videos and digital media essentially, where sometimes you might start off saying, an S3 bucket is where we’re gonna store everything.
It’s nice, cloud-based, centralized area, that’s enough for what it is. But as time goes on, you need to start thinking about what formats are we delivering those things at? Is that really the right level of compression that we want everything to be at? Is serving everything as originals, actually hurting our overall performance, rather than providing some type of variation or derivative of that file so that way it loads as quickly as possible.
So what I love about what [00:03:00] Grant’s gonna talk with us here today about is something that probably a lot of developers have felt when they’ve gone through working with images and videos on the web.
Jen Brissman: Absolutely.
Sam Brace: So Jen, one thing that of course we wanna point out to everybody is the fact that we are in lots of places. So if you are saying, okay, great, I love the concept of being on this live stream, I love that Cloudinary is talking with developers, well, note that you can always go to cloudinary.com/podcasts for all the previous episodes, because we’ve been doing this for about two years now, talking with developers like Grant and others that are really pushing the boundaries of what images and videos are on the web, and of course, mobile as well. So you can always go to cloudinary.com/podcasts for all of that. And we of course want to emphasize our Cloudinary Community. So if you hear some great conversations that are happening here and you want to keep those going, you can always pop over to the Cloudinary Community. That’s gonna be community.cloudinary.com, [00:04:00] and that’s gonna be the main space where you can see all of the content that happens to be there. So let’s go ahead and bring Grant on and have a little bit of detail about what he’s been doing at Formidable and being able to optimize that content. So Grant, welcome to the episode.
Grant Sander: Hello. Hello.
Sam Brace: Grant, obviously I’ve been very impressed with all the work that you’re doing, all the work that Formidable is doing, but maybe people that are listening to this, this is their first time encountering you. So tell us a little bit about you, a little bit about Formidable.
Grant Sander: Yeah, so maybe we start with Formidable. Formidable is a digital design and development agency. We do dev work, design work, product work for companies of varying sizes, anywhere from smaller startups all the way up to Fortune 100 companies.
So that’s like at a high level what Formidable does. We help companies build cool things and help level up their engineers to, to build cool things. That’s Formidable. I am just another engineer I suppose. Recently, I am now a VP of engineering at Formidable, which is a scary title.
But yeah, I basically help client teams succeed, help with client-facing work, but I’m also involved in partnership things. Like you mentioned, Formidable and Cloudinary are partners. We’re an agency partner of Cloudinary, so I do a lot of partnership engagement and also sort of help lead the open source initiatives at Formidable. Formidable does a lot of open source software work, and I help with that, building tooling [00:06:00] around it. A lot of times it’s tooling around what helps our client teams succeed, which is nice because if it’s helping our client teams succeed, it’s often helping other teams succeed.
So involved in a handful of different things at Formidable.
Sam Brace: Which is fantastic because it’s where you have a lot of concepts of what’s working for your clients that you’re working with, what’s working well for the various partners, as you mentioned, working with Cloudinary and others. And then even on top of that, being able to understand the brand presence.
So you have a lot of hands in lots of pots, but that’s a great place for us to be at with a guest like you. So this is fantastic that you’re involved with all of those different efforts. One thing I wanted to ask you about, because you mentioned some of the programming languages that you guys are working with at Formidable: you mentioned React, and you also mentioned React Native.
Grant Sander: Yeah. In [00:07:00] consulting it depends on what business you can get, although it’s a chicken and egg problem where you’ve gotta have engineers to staff things.
And this predates my time at Formidable by a number of years, but Formidable was also early to React. I think early engineers at Formidable kind of saw that React was gonna be the next big thing. And so we did a lot of work in the open source community with React and helping some larger clients with React work that were also early adopters,
and we just rode that wave, and then React Native came onto the scene and it’s been a great choice for a lot of companies, allowing [00:08:00] us to help people build cross-platform mobile apps, which is a pretty big deal. I mean, that’s the selling point of React Native: you can have one engineering team building for both iOS and Android.
With React, there’s differences between web and mobile, but a lot of your React engineers that are doing web work can also get cross-trained onto mobile development as well. So those are the reason why, we’ve been big on React and React Native is that, that’s the thing we’ve been using,
and also, React’s got a big community. There’s a lot of client work to be done. A lot of larger companies are choosing React. They have been for quite a while. React’s held the throne, it seems like, for quite a while now. We’re going where the market goes. We also like React quite a bit.
It’s a pretty nice framework and makes our lives quite a bit easier. But in later years, TypeScript has been pretty big. So now we consider ourselves a TypeScript shop as well in terms of programming [00:09:00] languages. We do a lot of work with TypeScript, help clients migrate to TypeScript.
Which has been great. And we also have engineers doing Rust and Go as well. So we do some additional languages. It’s not the bulk of our work. A lot of our backend work is still Node. We do a lot in the GraphQL space. That’s another space where we’ve been pretty big and early adopters.
Sam Brace: I love the fact that, you have all of these different viewpoints. You’re touching all of these different frameworks because it really does show that you’re able to provide a wealth of expertise to the [00:10:00] clients that you are working with. And that’s fabulous.
And of course, I love the fact that Cloudinary is an arrow in your overall quiver too, that you can bring that in to all the discussions, which kind of brings us to your project. So as we mentioned, you have Formidable, which is an overall agency, working with clients on all sorts of design and development efforts, but then you also have your main brand presence, so formidable.com.
Talk to me about what your decision making was with the website and why you decided to maybe go in a different route when it came to the ways that you were working with images for that site versus what you’re doing today.
Grant Sander: Yeah, so a number of years ago we had migrated our formidable.com, like our agency website to a stack of, we were using React Static, which was like, now considered an older framework, but it was a way for us to generate static webpages, which is great for an agency site like ours that doesn’t have a ton of dynamic content.
We’ll change things like [00:11:00] every week, every other week, but that’s a pretty low frequency for changes, and we were working with a stack of React Static. We were using Netlify CMS for allowing our marketing team to author content in a GUI, a visual interface. And then when they would make those changes, basically what would happen is it would commit those changes right into our Git repository.
So our entire website was totally Git-driven at the end of the day, and so this was pretty nice for us. In terms of the development teams we have, that’s a pretty natural flow for us. But when everything is static and it’s just a straight-up React site and everything is handled locally through the Git repository, images become a sticking point, because you’re carrying around a ton of images, like all the images you need for blog posts or case studies or whatever other kind of content you have on your site. And we don’t even really have user-generated [00:12:00] images, and we still had a pretty hefty load of images.
And it was a burden to try to have, say like our content authors or whoever’s generating images be very like in tune with like, how many megabytes is your image? What format is your image? And that’s a lot to ask of a content author to, to be like super, concerned with the nitty gritties of their images that they’re uploading.
So we’ve been migrating our site. Well, React Static has been at end of life for more than a year now, so we were in the process of migrating off of that. Netlify CMS has been great, but it’s also slowing down in its development. And we’re moving towards Sanity, which is another CMS that has been great for us and allows us to provide content.
So we’ve been talking about migration. Images, were like the biggest baggage in terms of like our existing flow, and sort of the lowest, [00:13:00] like lowest hanging fruit to some degree. It was like the most bang for buck in terms of our migration, which is great, especially because like most websites are very image heavy.
I think images make up, I dunno, we could probably Google the statistic, but the majority of like web bandwidth is just images. And optimizing images is a great first thing to do in terms of, saving bandwidth, improving speed, things like that.
Jen Brissman: So was there an actual problem or what was the actual problem that led you to look for an alternative in the way that your images were handled?
Were you experiencing any downsides to the way you were doing it before? And how did you know, okay, Cloudinary is our solution?
Grant Sander: Yeah, so when you have 1200 or 1500 images checked into your Git repository, pulling the repo down, like cloning the repo, moving things around, it gets pretty heavy.
It gets chunky. You end up with just gigabytes of images that are in there. [00:14:00] And we had a workflow for basically using Sharp, which is a Node library, a very high performance image transformation and processing library. And so we knew that we wanted to deliver sort of NextGen images, and so we had a process for turning our original image assets into WebP versions, and WebP is a NextGen image format.
Although now I’m wondering if it’s still considered NextGen or not, there’s some other players in the space, but we could probably loop back around to that. But we had a process for taking just a straight up jpeg, turning it into a WebP and it would get committed right back into our repository so that, because we were serving those images out, so we were now, if you had 1500 source images, we really had 3000 because we were creating these WebP versions.
That weren’t being resized or compressed at all. They were like literally just converting the format of them. We were, we had a workflow for trying to [00:15:00] reduce some image bandwidth on the site by serving WebPs where like you could, where there’s browser support for, that’s pretty good.
But yeah, we were running into this thing where our repo had a lot of baggage from managing our images and getting everything to process our original assets into WebP. It was just a very clunky workflow. And when we were looking towards, I don’t know, resizing images or using something like AVIF, like more efficient image formats, it was gonna be daunting.
It was gonna be like the multiplicative effect of, now, for every source image, you’re gonna need four generated images stashed in there. We were looking at just totally bogging down our Git repository. Also just realistically wasting engineering hours trying to create our own subpar version of image processing and delivery, and we try to avoid that at Formidable.
We often advise our clients to pick the solution that [00:16:00] requires the least amount of dev effort, which is generally good advice unless you’re in a very specific edge case. And so we figured it was time for us to take our own advice. We would probably never tell our own clients to check all of their images into Git. We would be pushing them towards image hosting solutions.
And so it was just a matter of time for us to get caught up on our own site.
Sam Brace: Well, I think it’s smart what you did, because it’s something that we hear continually when we start talking about headless architecture and composable architecture. A lot of the conversations around choosing what they call best of breed, right?
Which is where there’s lots of vendors that do something correctly. Don’t try to replicate the wheel, don’t try to do something that somebody already is spending thousands, if not hundreds of thousands, of R&D hours to understand. So as you said, serving everything as a WebP. Yeah, that makes sense.
WebP is ultimately something that’s meant to supersede JPEG, PNG, et cetera. But in the same sense, now you’re able to deliver [00:17:00] things as AVIFs or JPEG XLs or whatever you wanna be able to use. So it allows for people to be able to say, we got the format situation for you. It’s okay. You don’t have to spend all the engineering hours on it.
So I like the approach that you ultimately took with this effort, to know when do we want to really make that investment versus when do we wanna go with someone that’s already made that investment for us. So that was a very good choice on your part. What I would love to talk about, Grant, as you can see here, we have a whole blog post that is written on the Formidable site, which we also have linked here in the show notes.
And it’s gonna outline a lot of the things that we’re gonna be talking about here today. But what I would love for us to be able to do is take a look at some of the code, some of the details that you have here that really break down this particular sentence right here. The fact that you were able to reduce that image bandwidth by 86% across 30 of your most front-facing web pages.
So I’m gonna pop over to your screen here and we can start walking through some of the stuff that you were able to do at Formidable to make this [00:18:00] possible.
Grant Sander: Cool.
Yeah, I think we could step through this blog post a little bit. I also have these code snippets from the blog post pulled out into a GitHub gist, which I will probably jump over to because it’s a little easier to read there, and we have line numbers. But I just wanted to talk first about the big picture idea of, okay, we wanna migrate to Cloudinary.
What was our mental model for executing a migration of all of our images on our site. So this little image here sums this up where we took a four step approach of, and that might be a little small, but the first step was upload our entire image library from our Git repository up to Cloudinary.
That’s pretty low stakes and honestly, I botched it a couple times in the process, but was able to nuke my like folder and Cloudinary and then just do it again. And there’s a little bit of trial and error there, but uploading it [00:19:00] first was like, I could test it. I wasn’t actually changing anything on the website.
So that’s no harm, no foul. So step one was just taking the images from where we had them and getting them into Cloudinary in a way that allowed us to do our migration across the website and all of our content. So the second step was migrating the image URLs in our content. So once we have the media library in Cloudinary’s DAM, their digital asset... M is for what?
Sam Brace: Management. Management.
Grant Sander: Yeah. Okay. DAM library, whatever we’re calling that thing. Once it’s up there, all of our content was in Markdown files. We had a very Markdown-driven site. We needed to go into all those Markdown files and change all of the image paths from a relative path, because everything was in the same repository, served outta the same bucket.
We were using relative image paths for all of our image tags, and we needed to update those to be like, fully qualified URLs pointing to Cloudinary, [00:20:00] and have it, point to the right image. So this was the hard part was like getting that migration and automating that out so we didn’t have to do a bunch of manual work.
So writing the migration script for that.
Sam Brace: So talk to me about that, because I agree with what you said. That sounds challenging. So how, what were some of the steps that you remember having to take when you were talking about that migration script to make sure that everything was going from the relative to the situation that you have now?
Grant Sander: Yeah, we could probably jump into some code if you’d like. That might be the easiest way. Maybe the easiest way to do this is just jump right into some code. Do you wanna start there or do we wanna start with uploading the library to Cloudinary?
Sam Brace: Actually, yeah, if you wanna take a step back and talk about the uploading part, that’d be awesome.
Grant Sander: Yeah.
Sam Brace: And then we can jump back and talk about testing once we go through the first two steps. Yeah.
Grant Sander: So we have a lot of configuration. We had some baggage from our site where over the years we had changed how things were structured, and so the places where our images lived had changed over time. So there were some weird edge cases that I’ll point out here, but I don’t want us to get bogged down too much in those details.
So we configured our Cloudinary SDK. I omitted our cloud name, API key, and API secret here. I also only ever passed those as process args, so you had to pass them into a CLI command so that I’d never accidentally commit them into Git, [00:22:00] even though the Git repository is private. But it’s good practice not to commit your API secret into your repository.
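A minimal sketch of the "credentials as process args" idea Grant describes, so that nothing secret lives in the repository. The flag names (`--cloud-name`, `--api-key`, `--api-secret`) and the `parseCredentials` helper are assumptions for illustration; the gist's actual argument handling is not shown in the conversation.

```typescript
// Hypothetical shape for the three values the Cloudinary SDK needs.
interface CloudinaryCredentials {
  cloudName: string;
  apiKey: string;
  apiSecret: string;
}

// Parse "--flag=value" style arguments. The flag names are assumptions.
function parseCredentials(argv: string[]): CloudinaryCredentials {
  const args = new Map<string, string>();
  for (const arg of argv) {
    const match = /^--([a-z-]+)=(.+)$/.exec(arg);
    if (match) {
      const [, flag, value] = match;
      if (flag && value) args.set(flag, value);
    }
  }
  // Fail loudly when a credential is missing instead of uploading half-configured.
  const required = (flag: string): string => {
    const value = args.get(flag);
    if (!value) throw new Error(`Missing required argument --${flag}`);
    return value;
  };
  return {
    cloudName: required("cloud-name"),
    apiKey: required("api-key"),
    apiSecret: required("api-secret"),
  };
}

// In a real script (not run here) the result would be handed to the SDK, e.g.:
//   const creds = parseCredentials(process.argv.slice(2));
//   cloudinary.config({ cloud_name: creds.cloudName, api_key: creds.apiKey, api_secret: creds.apiSecret });
```

Keeping the parsing in a pure function means the secrets only ever exist in the shell command that launches the script.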
So you know what’s been [00:23:00] uploaded and what hasn’t been uploaded. So if you’re uploading 1500 images, keep track of the ones that have been successfully uploaded so you don’t have to redo those in case something falls over or fails. Learned that from field combat, from not doing that in the past.
So this is just like a status map where I just like literally had a JSON file that I would write to.
Sam Brace: Okay.
Grant Sander: That, so as images were being processed, you just create a record in a JSON file that’s just like the image name and then whether or not it succeeded or not. And then when you start the script up, we use that to check like which ones are already uploaded and which ones aren’t.
Not super important, but a fun little call out of if you’re doing a long running process, probably good to track your progress as you go in case you have to restart that thing.
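The status map Grant describes can be sketched roughly as follows. The type names, the `"uploaded"`/`"failed"` values, and the sample paths are assumptions; the source only says the JSON file records each image name and whether it succeeded.

```typescript
// Hypothetical record of upload outcomes, keyed by image path.
type UploadStatus = "uploaded" | "failed";
type StatusMap = Record<string, UploadStatus>;

// Restore the map from the JSON text saved by a previous run (empty if none).
function parseStatusMap(json: string | undefined): StatusMap {
  return json ? (JSON.parse(json) as StatusMap) : {};
}

// Keep only the images that still need uploading: anything not marked "uploaded".
function needsUpload(paths: string[], status: StatusMap): string[] {
  return paths.filter((p) => status[p] !== "uploaded");
}

// Example: one success and one failure recorded by an earlier, interrupted run.
const saved = JSON.stringify({ "blog/a.png": "uploaded", "blog/b.jpg": "failed" });
const remaining = needsUpload(
  ["blog/a.png", "blog/b.jpg", "blog/c.svg"],
  parseStatusMap(saved),
);
// `remaining` now holds the failed file and the never-attempted file.
// In a real script the JSON text would be read from and written to disk.
```

Failed uploads are retried on the next run because only entries marked as uploaded are skipped.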
Sam Brace: Smart.
Jen Brissman: Totally.
Grant Sander: So this is, this looks pretty ugly. It’s not super pretty code, but what I wanted to do was grab all the source images from [00:24:00] our repository and we, like I said, we had images living in different folders.
And so I have this configured in a configuration file, but this asset roots is just an array of paths to the folders where images lived. So using this library, Glob, which is a super cool library, I find myself using Glob all the time. There’s a lot of libraries in the space for globbing stuff, but it basically allows you to point to a folder and then just write a glob statement like this where it says, “Hey, go ahead and search through everything nested under there and find me all of the files with these extensions.”
And so this is a pretty easy way to go find all those images. PNGs, JPEGs, SVGs, GIFs, the whole kit and caboodle there.
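As a rough illustration of the glob statement described here, the sketch below builds one pattern per asset root. The root folder names are hypothetical, and the `jpeg` extension is added as an assumption alongside the four formats Grant lists.

```typescript
// Extensions named in the conversation; "jpeg" is added as an assumption.
const IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "svg", "gif"];

// Build a brace-expansion glob that matches images at any depth under `root`.
function imageGlob(root: string): string {
  const trimmed = root.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmed}/**/*.{${IMAGE_EXTENSIONS.join(",")}}`;
}

// Hypothetical asset roots; the real folder names are not given in the source.
const assetRoots = ["static/images", "content/uploads/"];
const patterns = assetRoots.map(imageGlob);
// Each pattern would then be handed to a glob library's search function,
// producing one array of matching file paths per root.
```

The `**` segment is what makes the search recursive, so images nested several folders deep are still found.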
Sam Brace: Yeah, it definitely comprises everything that you would’ve had there prior to moving to Cloudinary. That makes sense.
Grant Sander: Yeah. And then once I have found all those from the folders, just flatten that out into a flat array.
And then there’s some [00:25:00] extra stuff here of this was just like cleaning up some stuff we didn’t want, we wanted to filter out the WebP generated files, which had a specific, like prefix on ’em. Also clean up the path a little bit, just removing like the prefix, so it doesn’t have like user: “Grant Sander – home”, GitHub, whatever.
Jen Brissman: Right.
Grant Sander: Just removing the stuff from the beginning. And then this is the part where like checking against the status map where it’s okay, if it’s already been uploaded, don’t do it again. So we are only doing this on the images that are marked as needs upload. So we have all of our image files that we wanna upload.
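The clean-up steps just described (flatten, drop generated WebP copies, strip the machine-specific path prefix, skip anything already uploaded) might look something like this. `REPO_ROOT` and `GENERATED_PREFIX` are placeholder values, since the actual prefix and paths are not given in the conversation.

```typescript
// Assumptions for illustration: neither the repository root nor the prefix
// that marked generated WebP files is given in the source.
const REPO_ROOT = "/home/dev/site/";
const GENERATED_PREFIX = "webp-";

function collectSources(perRoot: string[][], uploaded: Set<string>): string[] {
  return perRoot
    .reduce<string[]>((all, found) => all.concat(found), []) // flatten per-root results
    .filter((file) => {
      const name = file.slice(file.lastIndexOf("/") + 1);
      return !name.startsWith(GENERATED_PREFIX); // skip generated WebP copies
    })
    .map((file) => (file.startsWith(REPO_ROOT) ? file.slice(REPO_ROOT.length) : file))
    .filter((file) => !uploaded.has(file)); // skip files already uploaded
}

const sources = collectSources(
  [
    ["/home/dev/site/static/a.png", "/home/dev/site/static/webp-a.webp"],
    ["/home/dev/site/content/b.jpg"],
  ],
  new Set(["content/b.jpg"]),
);
```

Stripping the prefix before the "already uploaded" check keeps the recorded names independent of whose machine ran the script.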
That’s not too bad. And then what I’m doing is I’m gonna skip a couple things that aren’t super important. What I did was ended up like batching the upload. Cloudinary does have some rate limiting in place. And I tried to like upload 50 images at a time [00:26:00] and like I hit some rate limiting issues and I was like, okay, let’s just, I have time.
I can run this and go eat dinner, so I just batched it at size two. But you can control that. And so what I’m doing is uploading two images at a time. There’s some logic here for batching up some stuff, but at the end of the day, to upload an image, it’s literally just cloudinary.uploader and then there’s an upload method and you just pass in the path of the thing.
Sam Brace: Yep.
Grant Sander: And that’s pretty, pretty darn simple. So there’s some complex stuff here, but it’s only because the complexity of the repo that I was working in and like finding the images, but to upload like a single image, you just pass it, pass it a file path and give it a public ID.
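A small sketch of the batching idea, assuming a batch size passed in by the caller (Grant used two). The `uploadOne` callback is injected so the example stays self-contained; in a real script it might wrap the SDK's `cloudinary.uploader.upload(path, options)` call.

```typescript
// Split a list into consecutive batches of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("Batch size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Upload batch by batch: items inside a batch run concurrently,
// while the batches themselves run one after another.
async function uploadInBatches(
  paths: string[],
  size: number,
  uploadOne: (path: string) => Promise<void>,
): Promise<void> {
  for (const batch of toBatches(paths, size)) {
    await Promise.all(batch.map((path) => uploadOne(path)));
  }
}
```

Waiting for each batch to finish before starting the next is what keeps the number of in-flight requests at or below the batch size, which is the property that helps with rate limits.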
One of the things we had to look out for was I had to take the path of an image and follow some logic to give it a public ID, which is how you’re gonna specify which image to load. That’s the key identifier. I had to make things [00:27:00] deterministic. And so there’s some cases where I had to rename a couple files because we had, say, gen.JPG and gen.SVG. It’s the same file name, different extension.
And that was causing some conflicts when I was generating public IDs. So I had to do a little bit of cleanup around that. But the point is generating a deterministic public ID based on the image path. And we’ll use that. You’ll see that again in the next step of migrating things. I also had a folder, so all these got uploaded into the same folder in Cloudinary.
Super easy just to specify that. And then I set the overwrite to true just in case it’s already there. I could just re-upload it in case.
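One way to picture a deterministic public ID and the upload options mentioned here. The slug rule in `toPublicId` is an assumption (the gist's actual rule is not shown in the conversation), and the folder name is hypothetical; `public_id`, `folder`, and `overwrite` are the option names the Cloudinary upload API accepts.

```typescript
// Derive a deterministic public ID from a repo-relative path: drop the
// extension, lowercase, and replace anything outside [a-z0-9_-] with "-".
// This rule is an assumption made for illustration.
function toPublicId(relativePath: string): string {
  return relativePath
    .replace(/\.[^./]+$/, "") // strip the file extension
    .toLowerCase()
    .replace(/[^a-z0-9_-]+/g, "-")
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Options in the shape accepted by the Cloudinary upload call.
function uploadOptions(relativePath: string, folder: string) {
  return { public_id: toPublicId(relativePath), folder, overwrite: true };
}

// "site-images" is a hypothetical folder name.
const options = uploadOptions("static/Blog Images/hero.PNG", "site-images");
```

Because the ID depends only on the path, the same function can be reused later to rebuild the ID when rewriting URLs in content.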
Sam Brace: Yeah, and actually, when I was looking at that little portion of it right there, line 70 to about 80 or 82, that was smart. The overwrite situation, because you don’t want any of your batch processing to get hung up because it’s saying it can’t overwrite an existing file.
In the case of what you talked about, where we do wanna overwrite a file that’s been, like, sample.JPG. Okay, we do need to overwrite that [00:28:00] from what’s there. So that was smart to make sure the batch continued and was processed completely.
Grant Sander: Yeah. And then status map once the image uploads, you set the thing on the status map.
And then I just, again, this is like not super scientific, but it works good enough. Every 10 iterations, I’m just writing out this JSON file. So it’s like hitting “Command + S” every time you upload 10 batches.
Jen Brissman: Totally.
Grant Sander: Yeah, again, like that, it’s good enough to save yourself some time in case something goes wrong in the middle of it.
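The "save every 10 iterations" habit could be sketched like this. The `runWithCheckpoints` helper is hypothetical, the callbacks are synchronous for brevity (a real upload loop would await each batch), and the final flush for leftover batches is an addition that the conversation does not mention.

```typescript
// Run `work` for each batch and call `save` after every `every` batches,
// plus once at the end if the last few batches were not yet saved.
function runWithCheckpoints<T>(
  batches: T[],
  every: number,
  work: (batch: T) => void,
  save: () => void,
): void {
  if (every < 1) throw new Error("Checkpoint interval must be at least 1");
  batches.forEach((batch, index) => {
    work(batch);
    if ((index + 1) % every === 0) save(); // periodic checkpoint
  });
  if (batches.length % every !== 0) save(); // flush the remainder (assumed extra step)
}

// Example: count how often progress would be written for 25 batches.
let saveCount = 0;
const fakeBatches = Array.from({ length: 25 }, (_unused, i) => i);
runWithCheckpoints(fakeBatches, 10, () => undefined, () => { saveCount += 1; });
```

In a real script, `save` would write the status map out as JSON, so an interrupted run loses at most the last few batches of bookkeeping.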
Sam Brace: It sounds like we’ve all been burned somehow without hitting “Command + S” at some point. So putting that logic into what you’re doing here, it makes sense. So I like this a lot and frankly, in my opinion, if we’ve covered nothing else in this episode, and obviously we’re gonna cover more, but if we’re covering nothing else, to be showing a clear migration path from something to Cloudinary, which you’ve provided here, is amazingly useful for the million plus developers that are using Cloudinary. So this is fantastic information.
Jen Brissman: Yeah, even including [00:29:00] overwrite: true, and everything we just talked about, like line 70 all the way through your check to periodically save your progress. Like, anyone watching, this is open source. You could probably take this or make a similar version of this.
I totally agree, Sam. If people only get this, it’s probably valuable.
Sam Brace: Absolutely. Absolutely. So Grant, we’ve covered the point of, not the migration efforts, but the uploading of the library to Cloudinary. Is there anything else that’s important to mention about this process?
Grant Sander: I think we covered most of it. The upload process, like, the Cloudinary SDK makes it just very easy. I think the big part was, you’ve gotta keep your eyes on the public ID, because that’s the thing that you’re gonna end up using. At some point you’re gonna end up using that public ID again.
So making sure that you have some logic for deterministically generating those, so that when you go to migrate your content, you can give the proper public [00:30:00] ID. I think this get cloud name is somewhere else in this gist. The details aren’t really that interesting.
It’s based on whatever your own setup is, but I think the public ID is another one just to keep your eyes on. But yeah, other than that, I think that kind of covers the gist of the upload process.
Sam Brace: Yeah. And I, and it is something that we push a lot in training where when we are talking to customers, to think about naming conventions for their files or their assets, and public ID is how we do that.
And I think your example was perfect, where gen.JPG versus gen.SVG technically have the same public ID with the way that Cloudinary looks at it, because file extension is not considered as part of that. Unless we’re talking about a wildly different format. If it’s gen.MP4 versus gen.JPG, yeah.
Those might be seen as different situations. So it is important for people to think about, how do I want my files to be named once they are migrated to Cloudinary? Do we want ’em to stay the same as local? Do we want [00:31:00] some form of randomization to prevent overwriting? There’s lots of ideas around that. So I think that was good for you to focus on that.
Grant Sander: Nice. Cool. So then, yeah, I …
Sam Brace: We’re moving to migration, right? So that’s step two of the four part process?
Grant Sander: Yeah. Yeah. Let’s talk migration, which I will go ahead and find that file. Let’s see.
Jen Brissman: Yeah, Grant, while you find that, I just wanted to air your tweet, which really made me laugh, where you said “Cloudinary is boss”.
When you tweeted about this blog, I really got a kick outta that. And I know that probably, as you said, the upload part is relatively simple and straightforward, and I’m looking forward to getting to the code where we talk about optimizations and where, what probably led you to saying Cloudinary is boss.
Grant Sander: Yeah, that sounds great. We’ll churn through the migration part here to get into some more interesting stuff. Again, a lot of this [00:32:00] has details around the specifics of how our repository was structured. So we’ll just touch on some of the high level points.
So again, we’re gonna, we’re gonna do some “globbin'” in here, which is great.
Sam Brace: Great.
Grant Sander: So a similar thing here of grabbing all of our markdown files. So you can see here we’re going in and we’re saying, “Hey, inside of the content folder, this is where all of our content files live. Give me all the markdown files.”
So now we’ve got all the files we want to inspect, and this is the primary place where we wanna update URLs. So just looping through the files. So for const file of files, just loop through one at a time, doing a Node file system read file, and just reading out that file. So this is taking a Markdown file and loading it into memory with UTF-8 encoding, so we can look at that and work on it like it’s a string. So at this point, file contents is literally just a string, which is the Markdown content that’s in there, which is [00:33:00] nice because if you have a string you can do whatever sort of string shenanigans you want.
So I am a sucker for a good regular expression. And we have a nasty one here. I don’t, I think maybe all regular expressions look a little bit nasty.
Sam Brace: I agree, I definitely agree with that too.
Grant Sander: So, we won’t go too far into this, but you can pick out a couple things here where we’re looking for things that end with PNG, JPG, SVG, GIF, WebP. And they also start with, I mentioned the asset root before, which is the folders in which images would live. So it’s going and saying, okay, if we see any things that start with the forward slash and they have the asset folder name to start, and then they end in like a .JPG, that’s gonna be a thing that’s referencing, it’s like a relative URL to an image that’s hosted in our repository. So we can use [00:34:00] that “r”. This is not a great name, but I’m scripting and so I’ll give myself a little bit of grace here. Using that regular expression, and with a regular expression, you can pass that into the string replace method and you can actually do logic on these things.
So this gives us the match itself and there’s a lot of like regular expression stuff going on here, but what we’re doing is basically generating the Cloudinary ID. This is what the C ID is, and it’s using the GET cloud name that we saw before. And this is like the thing that like took a file path and turned it into a, deterministically, turned it into an ID for Cloudinary to use.
And here we are using it again because now we’re looking at the file path and we gotta say, all right, we need that file, or we need that Cloudinary ID to use as part of this URL. So Cloudinary has their own URL structure for requesting images. So we have, [00:35:00] we’re gonna use our project ID, image upload, I set a version number on these, which we could get into some details on there. Passing in a folder and then using the Cloudinary ID. So this is how you walk through each markdown file, you use that same function that generated the ID, and you just use that as the last part of this Cloudinary URL.
And then you just do a little swaperoo. You just swap that relative path for a Cloudinary URL. And you write that thing right back into the markdown file that it came from.
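The swap Grant describes can be sketched roughly as follows. This is only an illustration, not Formidable's script: the cloud name, version segment, folder, asset folder names, and the `toCloudinaryId` helper are all hypothetical placeholders, and the regular expression is a simplified guess at the one shown on screen.

```typescript
// All constants below are hypothetical placeholders, not Formidable's real values.
const CLOUD_NAME = "my-cloud"; // assumed Cloudinary cloud name
const VERSION = "v1"; // assumed fixed version segment
const FOLDER = "my-site"; // assumed upload folder
const ASSET_ROOTS = ["static", "assets"]; // assumed asset folder names

// Hypothetical stand-in for the ID helper. It must be deterministic so the
// upload step and the rewrite step produce the same ID for the same path.
function toCloudinaryId(relativePath: string): string {
  return relativePath
    .replace(/^\//, "") // drop the leading slash
    .replace(/\.[a-z0-9]+$/i, "") // drop the file extension
    .replace(/[^a-zA-Z0-9]+/g, "-"); // collapse everything else to dashes
}

// Matches relative paths such as /static/blog/hero.png. The lookbehind skips
// paths that are part of an absolute URL (e.g. https://example.com/static/x.png).
const imagePath = new RegExp(
  `(?<![\\w.-])/(?:${ASSET_ROOTS.join("|")})/[\\w./-]+?\\.(?:png|jpe?g|svg|gif|webp)\\b`,
  "gi"
);

// Replace every matching relative image path with a Cloudinary delivery URL.
function rewriteImagePaths(markdown: string): string {
  return markdown.replace(imagePath, (match) => {
    const id = toCloudinaryId(match);
    return `https://res.cloudinary.com/${CLOUD_NAME}/image/upload/f_auto,q_auto/${VERSION}/${FOLDER}/${id}`;
  });
}
```

In a full script, each markdown file would be read into a string, passed through a function like `rewriteImagePaths`, and written back to the same path. The extension list would need to cover whatever formats actually exist in the repository.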
Jen Brissman: Question for you, Grant. Oh, go ahead Sam.
Sam Brace: No. Go ahead. You had a great question.
Jen Brissman: I was just gonna say in line 42, is that an exhaustive list of file extensions that it could be, or do you have a fallback in case there is, something hanging out in there that, that isn’t on that list?
Grant Sander: Yeah, this for our case, this was exhaustive. We didn’t have anything outside of PNGs JPGs, SVGs, GIFs, [00:36:00] WebPs. Theoretically it’s not exhaustive. There’s other image formats that could have been in there. And so for our case it was good enough. But like you would probably, like if you were using, what are some other image formats?
Jen Brissman: AVIFs…
Grant Sander: Other some other images. You might also wanna include those in there. But we had, just the vanilla image format, so it was a little easier for us. But yeah, this, you would wanna update this to cover all of your image extensions.
Jen Brissman: And I guess you wrote the logic anyway, that if something were to break, it would stop and have saved that batch. So there you go.
Grant Sander: Yeah. Yeah. And we’ll, the next script is actually like validating that images didn’t break. And so I had a little bit of a fallback in case, we had an oopsie.
Jen Brissman: Yeah. Cool.
Sam Brace: A couple of things I want to ask you about here, and I think you’re highlighting it, so yeah, we’re seeing the same thing, Grant. So talk to me about your decision to add f_auto and q_auto to this overall process. Cause I, [00:37:00] obviously, I know what f_auto, q_auto is. You know. Jen knows. But talk to me about your decision making on what and why you decided to add those transformations.
Grant Sander: Yeah. So with Cloudinary’s image URL transformation syntax, you can pass in these transformation parameters that transform the image.
So if you left out this part here, we deleted this part out, Cloudinary is just gonna serve you back the original image asset. And that’s okay, but we mentioned earlier that a lot of times serving up an original image is not the most optimized, it’s almost never the most optimized solution.
Sam Brace: Correct.
Grant Sander: Cloudinary, this is one of the things that I think makes Cloudinary so powerful, and makes this sort of thing low hanging fruit, is this f_auto parameter is auto format, which says, “Hey, your browser requests the image, Cloudinary looks at information about the requesting browser, says what’s the best image format to use for this browser?”
So if it’s Chrome, it’s probably gonna be like an AVIF. If you are [00:38:00] like, for some reason on Internet Explorer 11 or something like that, it’s gonna choose something else that’s not a next gen because you’re using a 20 year old, 10 year old, browser. And it just serves the best one. So this is, you don’t have to think about it.
Cloudinary’s magic just serves the best format for you. So that’s nice because now you don’t have to specify and you don’t even really have to care too much about the original image extension. You just say, “Hey, Cloudinary, gimme the best one. Whatever. We trust you to give us the best one.”
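The conversation does not go into how Cloudinary makes this choice, so the following is only a rough sketch of the general idea of content negotiation: a server inspects the `Accept` header the browser sends and picks the most compact format the browser advertises. The `pickFormat` function and its preference order are assumptions for illustration, not Cloudinary's implementation.

```typescript
// Pick an output format from a request's Accept header.
// Hypothetical helper: prefers AVIF, then WebP, then the original format.
function pickFormat(acceptHeader: string, originalFormat: string): string {
  // "image/avif,image/webp,*/*;q=0.8" -> ["image/avif", "image/webp", "*/*"]
  const accepted = acceptHeader
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase());

  if (accepted.includes("image/avif")) return "avif";
  if (accepted.includes("image/webp")) return "webp";
  return originalFormat; // older browsers get the format that was uploaded
}
```

The point of the sketch is that the decision moves from the content author to the server: the same URL can yield different formats depending on who is asking.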
Sam Brace: And then q_auto, to double down on that real fast, like when we were talking earlier about your original format situation where you’re converting everything from JPGs to WebPs, I think your Internet Explorer one is perfect because as we know, Internet Explorer and WebP don’t really get along because WebP was becoming more of a format as Internet Explorer was starting to get phased out, and Edge was becoming a thing for Microsoft.
So it does help for those backward looks as well as those forward [00:39:00] looks. So I think it was really smart to make sure that you’re looking at it from that sense. And yes, you’re looking at exactly what I was gonna recommend is, caniuse.com. This is always a great way to see the latest and greatest and what types of browsers can serve certain formats.
And yeah, as you can see here, like older versions of Chrome, older versions of Firefox, they can’t handle AVIFs. If we pull up WebP, it would be almost identical, yeah, in some of those cases too. So it’s nice that you have something that can constantly format to every user’s various browser choices, cause probably there are still some IE users out there that are using IE 6, in my opinion, which is crazy, but it’s true.
Grant Sander: Yeah. Yeah. And I think even AVIF, as an example, has pretty good support, but Edge doesn’t support it, which is interesting to me. I would’ve thought they would. I believe Edge does support WebP, and it has for quite a while, actually, since very early on.
So the nice thing there is okay, AVIF is great. It’s probably better than WebP in a lot of [00:40:00] scenarios. It’s gonna use that most of the time. But like Edge users, we can’t forget about them. Edge is a good browser, so let’s serve them WebP, which is pretty good. And then fallback to something else if you’re for some reason running IE 11.
Sam Brace: Yeah. And if you are, I strongly suggest you don’t; go ahead and update your browser. But like I said, different strokes for different types of people. So it all makes sense. So it’s already to jump out of the f_auto, q_auto situation, but I love the choices there.
But what about q_auto?
Grant Sander: Yeah, so f_auto serves you the best format, so like image format, which is great alone for bandwidth because a lot of these next gen formats have better compression ratios. And so you can send smaller images, which is really good. But the q_auto is auto quality, which allows, basically it allows Cloudinary to choose the right compression.
And so if you take an original JPG, you can say, all right, let’s save this as 70% quality so we can shrink this file size quite a bit. So there’s always like a [00:41:00] balance between file size and image fidelity. That’s like the trade-off here is okay full quality, you have the highest level of fidelity, but you have the largest file possible.
And so moving the needle on that lever a little bit, it’s like a balance of what’s the best thing to serve here. So q_auto just allows, again, Cloudinary’s magic to pick where to put that needle for you. You don’t have to do it. It’s just, what’s the best balance of fidelity to compression and file size here?
And so again, this is an easy way just to use the thing that Cloudinary does really well, which is doing that for you, and just not burdening our content authors with making that choice. They don’t have to worry about it. It’s just gonna be a good one.
Fidelity’s gonna be good enough. It’s gonna reduce image bandwidth as much as you can without [00:42:00] making images look super distorted. So I see f_auto and q_auto as going hand in hand in a lot of ways, as just a good default for not really thinking a whole lot about image stuff, to a large degree.
You can just forget about the nitty gritty of the image delivery and format and stuff and just be like, yo, I want a really cool image of this thing. And then Cloudinary handles like serving it well.
Sam Brace: Oh my gosh, you said almost the exact words that I always say when I talk to people about f_auto, q_auto. I always talk about it in terms of set it and forget it, where it’s just let it do its magic and handle it.
Cause you’re always going to make sure that you’re getting a 200 when you’re serving something. You’re always getting it where, yeah, the image is gonna come through. As you were pointing out, if I tried to serve an AVIF today to an Edge user, I could potentially get a 404 or not be able to display content, something that we don’t want.
This guarantees deliverability; it [00:43:00] delivers it at the right bandwidth, the right compression, the right format every time. So yeah, set it and forget it, the way you put it, just set it and leave it, whatever people want, but it works nicely in that case.
Jen Brissman: I was gonna say line 31 is why I would venture to say Grant tweeted “Cloudinary is boss”, because f_auto q_auto – we’ll say it till the cows come home, that is Cloudinary’s special sauce, our magic, and I’m glad that we all agree and if anyone’s just listening and not watching, they wouldn’t see the big huge smiles on my face and Sam’s face as Grant was saying all of that before, because it’s like Sam said, that’s basically like a script that we say in training is everything Grant said. So he really couldn’t have said it better, Grant.
Sam Brace: We didn’t coach you ahead of time, Grant, so you’re…
Jen Brissman: Yeah. Seriously no coaching.
Sam Brace: But one thing I wanna ask you about, cause this is interesting to me: line 32, where you’re declaring the version number here, what was your decision making there?
Because I can see it’s a very specific version number. What [00:44:00] was your choice there? Because that’s something that I haven’t seen a lot in people’s code and it doesn’t mean it’s wrong. I’m just interested in what you were doing there.
Grant Sander: Yeah. I’m not gonna make a strong argument that you should do that. Honestly I think it was partly coincidental from grabbing one of the images that I uploaded.
I’m guessing that’s where that version number came from: it was the first image I uploaded and grabbed the URL for. There’s also another reason that I left that in there, which was for regular expression reasons: the fact that you can have folders, like you can have a nested folder structure in your Cloudinary ID, and you can nest things with slashes.
It can be a little hard to figure out where your parameters start, like your Cloudinary parameters. So after upload, you add in your Cloudinary parameters.
Sam Brace: Yeah.
Grant Sander: And then you can add your version number in there and then put in your like folder and path to the image. Without a very specific, like v [00:45:00] with a bunch of numbers, it can be hard to figure out from a regular expression standpoint where to inject in the parameters.
And so I set up some automation to take any Cloudinary URL and check to see if it had f_auto and q_auto. And if it didn’t, it would automatically inject them in there. So if people were picking a Cloudinary URL, we had a GitHub Action script set up and it would automatically add ’em in there.
And it was hard to do some of the automation without the version number there. I may have been able to do it, but, yeah, I like it.
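A minimal sketch of that kind of automation, assuming the version segment is always present and looks like `v` followed by digits. The `ensureAutoParams` name and the choice to append missing parameters as a separate transformation component are assumptions, not details from Grant's GitHub Action.

```typescript
// A path segment counts as a version if it is "v" followed only by digits.
const VERSION_SEGMENT = /^v\d+$/;

// Add f_auto and/or q_auto to a Cloudinary image URL if they are missing.
function ensureAutoParams(url: string): string {
  const marker = "/image/upload/";
  const start = url.indexOf(marker);
  if (start === -1) return url; // not a Cloudinary image delivery URL

  const prefix = url.slice(0, start + marker.length);
  const segments = url.slice(start + marker.length).split("/");

  // The version segment marks where transformations end and the ID begins.
  const versionIndex = segments.findIndex((s) => VERSION_SEGMENT.test(s));
  if (versionIndex === -1) return url; // no anchor to work from, leave as-is

  const transformations = segments.slice(0, versionIndex);
  const rest = segments.slice(versionIndex);

  // Flatten "w_300,c_fill" style components into individual parameters.
  const existing = transformations.join(",").split(",").filter(Boolean);
  const missing = ["f_auto", "q_auto"].filter((p) => !existing.includes(p));
  if (missing.length === 0) return url;

  return prefix + [...transformations, missing.join(","), ...rest].join("/");
}
```

Without the version segment, a nested ID such as `my-site/blog/hero` is hard to tell apart from a transformation component by pattern alone, which is the difficulty Grant describes. One caveat of this sketch: a folder literally named like `v2` would be mistaken for the version.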
Sam Brace: It makes sense. It makes sense.
Jen Brissman: That’s the first time that I’ve ever heard of someone using a version number, sort of as like a pipe or a place to look for. So like in your code, did you have it look for “/v” or something? Or, how did you have it?
Grant Sander: Yeah, so I guess I don’t have the code pulled open right now, but if we opened the console, like if you were to do something like you look [00:46:00] for something like “v” and then you could do like digit, multiple digits, and then you could test something like this.
I don’t know if you can see it. Let me blow that up a little more.
Jen Brissman: Yeah.
Sam Brace: Absolutely.
Grant Sander: That doesn’t pass, but if you have something like “v123”, it’ll pass that. So it, it was just a way to do like regular expression stuff, to figure out where things go. And there, again, I’m not saying, I’m not fully sold, that there’s no way to do this without the version number.
It was just the path of least resistance for me. A lot of these older images, we didn’t need to do a ton of versioning on them. And so at some point, leaving the version number out is nice because then Cloudinary just uses the most up-to-date one. And yeah, I would have to think more about it. I’m not a hundred percent convinced I would advise people to leave the version number in there.
It’s just something I did out of the migration process and making life a little easier. [00:47:00]
Sam Brace: And the way I look at it, Grant, is this is once again showing a real world scenario, and it worked perfectly for you. So it is to say if someone ever had issues when they’re trying to migrate things over using regular expressions or regex, then this is a reasonable outcome in my opinion. So this is good. So now that I’ve seen this, I feel like we’ve accomplished two steps now, but the third step, and we didn’t dive into it deeply, cause of course I backtracked this and let’s talk about code. But in the same sense, the third step was testing this migration. How did this work? So talk to me about the overall testing process.
Grant Sander: Yeah, let’s, we’re gonna run outta time at some point, so we’ll go through this one pretty quick because I think the fourth step is maybe a little more interesting. But the migration validation was pretty cool. So it was like, I wanted to make sure I didn’t totally botch things, which is good.
It’s good to validate, it’s like writing a test for your migration, which is nice. And what I did was basically built our [00:48:00] static site, which then dumped the entire built site out into a public folder, is the name of the folder, but it’s basically just like building your static site and it’s just dumping out all the HTML files for your static site.
Got it. So what I did was, again, it did some globbin’ and grabbed all of the HTML files out of the built site and another status map, because this was a process that took a little while. So saving my progress as I go. But what I did was loop through each HTML file from the build output. And I use this thing called, let’s see, Broken Link Checker is the actual name of this library, and Broken Link Checker has, I have it pulled open here.
It does a handful of things. It basically allows you to find broken links, missing images, et cetera. It’s like, whoa, that’s what I need to do. I need to make sure I don’t have any missing images in here. So what I did was, [00:49:00] I used this library. I just read each HTML file from the build output, and just configured this HTML checker thing to look for images.
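The validation idea can be sketched without the library. The code below is a simplified stand-in, not the Broken Link Checker API: it pulls image sources out of an HTML string with a regular expression and records a status for each one, skipping anything already recorded so an interrupted run can resume. The `isReachable` callback is a hypothetical hook where a real script would make an HTTP request.

```typescript
type Status = "all good" | "no bueno";

// Pull every <img src="..."> value out of an HTML string.
// Simplified: a real HTML parser would be more robust than a regex.
function extractImageSources(html: string): string[] {
  const sources: string[] = [];
  const imgTag = /<img\b[^>]*?\bsrc\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = imgTag.exec(html)) !== null) {
    sources.push(match[1]);
  }
  return sources;
}

// Check each image once and record the outcome in the status map.
async function checkImages(
  html: string,
  isReachable: (url: string) => Promise<boolean>,
  statusMap: Map<string, Status> = new Map()
): Promise<Map<string, Status>> {
  for (const src of extractImageSources(html)) {
    if (statusMap.has(src)) continue; // already checked in an earlier run
    statusMap.set(src, (await isReachable(src)) ? "all good" : "no bueno");
  }
  return statusMap;
}
```

Passing the same `statusMap` across files, and saving it to disk between runs, gives the save-progress-as-you-go behavior mentioned for this slower step.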
Sam Brace: Yeah. And excellent. I like the “all good, no bueno” captures there. Fantastic work there too. Okay. So I understand the testing process and honestly, once again, this is really good playbook material for anybody that’s going through migrations when they’re going through this overall process. But then we’re now to the final step, the big one, right?
Which is where everything is live and we’re able to start going through and seeing results of this overall process. So talk to me about that.
Grant Sander: Yeah, so at this point we’ve got all our content migrated, we’ve got our image library up in the cloud. We’re ready to go. We’re ready to merge the thing and ship it live.
One of the things I did before that was just do a quick check. Like I knew we were gonna be saving bandwidth. Like it’s pretty, pretty obvious. That’s a strong word, but it was somewhat obvious that, that was gonna be the case. I didn’t have any doubt. I [00:51:00] wanted to figure out “okay, how much actual savings are we gonna get?”
Cause I did a spot check of a couple pages and I was just like this seems like quite a bit. Are we really that image heavy? Were we really overserving that much? So what I did was wrote a script. Let me go find this. To basically automate, the way I checked it was I would open a webpage, open my Chrome dev tools, and so let’s go to formidable.com, open my dev tools and go to the network panel.
So what I did was just did a quick spot check on like our homepage, a couple other important pages we had up for the time. [00:52:00] And I just did a little bit of napkin math and computed, we were having savings around 75 to 80%, which I was like, maybe I should just check a bunch of pages.
But I don’t, I didn’t wanna do it manually and I wanted to be able to tweak some things and make sure things were right. So what I did was just write a script, and again, I gotta go find this, wrote a little script to basically automate that exact thing that I did using a tool called Puppeteer. So Puppeteer, just for those who don’t know, Puppeteer is a pretty cool tool.
I believe it’s out of Google.
Sam Brace: Yes.
Grant Sander: And it’s a Node library, which provides high level API control to chrome / chromium over the dev tools protocol. So basically what it allows you to do is create and automate, like controlling a Google Chrome instance. It’s basically what it is. So you can spin it up, point it to webpages.
It’s almost like a little robot driving your little chrome browser. And have a lot of APIs for doing different things. So [00:53:00] you can do a lot of pretty cool stuff with Puppeteer.
Sam Brace: Yeah, and we love Puppeteer over here. It’s interesting, you’re not even the first to mention this on a DevJams episode, because we had it where people were taking screenshots of code and using Puppeteer for that purpose.
We had a developer that was developing Open Graph images for his blog posts and doing that with some Puppeteer aspects. Fabulous tool. Can’t recommend it enough for things like this. Once again, good work, Grant, using awesome tools. So good job there.
Grant Sander: Yeah. Another interesting one that there’s better tools in the space now, but for a long time, like generating PDFs was a pain.
And so I remember like early days of using Puppeteer, you’d build a webpage that was like 8.5 inches by 11 inches. Point Puppeteer to it, do a basically simulating a command P to print the thing out and generating out like a PDF from the page. And it was like a really good way to use like web tooling.
And it was like you were designing a PDF using all the web tools that you knew, and then you just used Puppeteer to automate [00:54:00] generating those. So dynamically generated like receipts or like inventory lists, things like this. You could use Puppeteer. So a lot of really cool use cases with Puppeteer.
Sam Brace: For sure.
Grant Sander: So with what we did here for measuring images, this was tough. This was a tougher use of Puppeteer than what I’ve done in the past, but there’s some setup around getting Puppeteer up and running. But basically what we do is: launch Puppeteer. I did headless false, which shows like the actual view.
So you can see it like scrolling through pages and stuff, which is cool. I created a single page within the browser instance and just set like a viewport with- the viewport size isn’t really super important on this case- but there are some, there’s a way for you to actually create a CDPSession, which gives you like access to a dev tools instance.
So this gives you a way to like programmatically access chrome dev tools from the like robot run, [00:55:00] Puppeteer instance that you have.
Sam Brace: Okay.
Grant Sander: And then you can send signals. And this is, I found a lot of this like from guides on the internet, things like that. But one of the things you do is you enable the network functionalities of dev tools by sending it this network enable symbol.
And then there’s a lot of code here and, we’re gonna run out of time, so I don’t wanna go through all the details of everything. But one of the things that basically I did was you could tap into a couple events. And so network response received was an important one where we listened for image responses.
And so image or binary octet-stream as the content type. And then you store those because we also have another event. There’s some like coordinating that we have to do here. I won’t go all the way into the details of this, but when a loading finished, you see if it was one of those images that we caught earlier and we go and say, okay what’s the image count?
We’ll just keep track of that. And then, [00:56:00] I’m not convinced this is the perfect way to do this, but a decent indicator of the image size is the encoded data length property. And this matches up pretty well with what Chrome gives you as well. My impression is this might not be perfect, but it’s pretty darn close.
So it is like good enough for us to measure. It matched up from the pages that I spot checked, it matched up like perfectly with what I saw. So for me it was good enough and it’s a consistent measure across control and treatment groups. So my opinion, it was good enough. So then I basically just had- there’s some math here.
We don’t need to look too much at the dividing and stuff, but basically, it is pretty standard percentage comparison stuff. But I have a control URL, which was at the time our production site, and then a staging URL where the new Cloudinary version lived.
And so I have a control group and a treatment group. And for each URL that I wanted to test, I would point Puppeteer to [00:57:00] the URL for the production one. And then for the staging one, measure all the stuff. I did a scroll, too, to basically scroll down the entire page to get the lazy loaded images to load.
So that’s pretty common, for images to lazy load. Wait for the network to idle so that you can capture all of the loading finished events to make sure all the images load in, and then just return back the total, which was the total number of encoded bytes. And then I wrote all that into a JSON file, did the percent calculations, things like that.
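The event coordination Grant mentions can be sketched as a small tracker that is independent of Puppeteer. The event shapes below are trimmed-down versions of the Chrome DevTools Protocol `Network.responseReceived` and `Network.loadingFinished` payloads; the `ImageBandwidthTracker` class and the exact content-type check are assumptions for illustration, not the code shown in the episode.

```typescript
// Minimal shapes of the two DevTools Protocol events used here.
interface ResponseReceivedEvent {
  requestId: string;
  response: { mimeType: string };
}
interface LoadingFinishedEvent {
  requestId: string;
  encodedDataLength: number; // bytes received over the network
}

class ImageBandwidthTracker {
  private pendingImages = new Set<string>();
  imageCount = 0;
  totalEncodedBytes = 0;

  // Call for every Network.responseReceived event: remember image requests.
  onResponseReceived(event: ResponseReceivedEvent): void {
    const type = event.response.mimeType.toLowerCase();
    if (type.startsWith("image/") || type === "binary/octet-stream") {
      this.pendingImages.add(event.requestId);
    }
  }

  // Call for every Network.loadingFinished event: count it only if the
  // request was flagged as an image when its response arrived.
  onLoadingFinished(event: LoadingFinishedEvent): void {
    if (!this.pendingImages.delete(event.requestId)) return; // not an image
    this.imageCount += 1;
    this.totalEncodedBytes += event.encodedDataLength;
  }
}
```

In a Puppeteer script, these two methods would be attached as listeners on a CDP session after sending `Network.enable`, with one tracker per page load. The page would then be scrolled and allowed to go network-idle before the totals are read.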
And the results I know we’re, we’ve just got a few minutes, but the results were pretty, pretty amazing from, just from like our homepage, for example, 27 images, percent saving was 48%. We had a lot of these that were like 86 to 90%. All of our image heavy pages. We have case studies that have these beautiful graphics and these ones we saved a lot.
A lot of these, the average across the 30 pages was about 80. It was a [00:58:00] little over 86% bandwidth saving, which was…
Jen Brissman: Wow.
Grant Sander: Quite a bit more than I was expecting. It was a surprising amount.
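The percentage comparison itself is simple arithmetic. The sketch below assumes the saving for a page is the drop in image bytes relative to the control measurement, and that the overall figure is the mean of the per-page percentages; the episode does not say which averaging method was used, and the type and function names are hypothetical.

```typescript
interface PageMeasurement {
  url: string;
  controlBytes: number; // image bytes on the production (control) site
  treatmentBytes: number; // image bytes on the staging (treatment) site
}

// Percentage of image bytes saved on one page, relative to the control.
function percentSaved(controlBytes: number, treatmentBytes: number): number {
  if (controlBytes === 0) return 0; // no images measured, nothing to save
  return ((controlBytes - treatmentBytes) / controlBytes) * 100;
}

// Mean of the per-page percentages (assumed averaging method).
function averagePercentSaved(pages: PageMeasurement[]): number {
  if (pages.length === 0) return 0;
  const sum = pages.reduce(
    (acc, p) => acc + percentSaved(p.controlBytes, p.treatmentBytes),
    0
  );
  return sum / pages.length;
}
```

Averaging per-page percentages weights every page equally. Dividing total bytes saved by total control bytes would instead weight image-heavy pages more, so the two methods can give different headline numbers.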
Jen Brissman: Yeah. Amazing.
Sam Brace: One thing that I wanted to ask you about, because now that we’ve seen the overall process and, it’s great, because as I’ve said a few times now, this is giving people a real world view of how they could actually accomplish a lot of these tasks with the tools that you’ve gone and shown.
Of course, they may not have your expertise, and so they needed some help to be able to go down this path, but you’ve given it to them. But to wrap this all up, so as we have talked about at the very beginning of this conversation, Formidable is doing a lot of development and design work for clients, for large companies.
You also said small startups, but people that need design and development help. With the learnings that you took through this project to help out the Formidable site, how are you now applying some of that to your client base?
Grant Sander: Yeah, I think one of the big things is we’re very eager to [00:59:00] recommend services like Cloudinary early, like upfront in the development process, so that you can just avoid having to migrate at all. Because migration comes at a cost, like you have to do the work. And so one of the things is just here’s an example of how much you can save by using a service like Cloudinary. Just use it upfront. If you know that you’re gonna have a lot of images, user-generated images, a lot of content, visual heavy content, just start with Cloudinary upfront.
So I think that’s our recommendation to our clients now, is like you said, best of breed. Choose the thing that’s gonna get the job done the best and this is a good showcase of that.
Sam Brace: And I would also even say, I appreciate you recommending Cloudinary, but I think an even stronger message, one that’s not just about us, is that you really do need to think about your images and videos and how they’re gonna be displayed, how they’re gonna load.
It’s a very important part to how users experience these websites. So I think the fact that you’re addressing Cloudinary at the beginning is also stating to [01:00:00] them, you need to have a strategy when it comes to how you want your images and videos to be loading and how you want the user to receive those and be actionable with that information.
So I love the fact that you’re making your clients think about media more, which is great.
Grant Sander: Yeah, absolutely.
Sam Brace: Jen, final thoughts here. What do you think, is there anything that’s burning in your brain here about what me and Grant and you have covered here today?
Jen Brissman: Yeah, the biggest thing is we’ve talked a lot about images today. And especially with client work, my brain was already going to “okay. What about other types of media?” “Is Grant, are any of his clients asking about 3D or audio or video?” We’ve really focused on images today, but Grant, do you think this would apply to everything?
Grant Sander: Yeah, I think video is at least as complex as image is and we have, for example, our Puma team has been doing some exploration with using Cloudinary’s programmatic video generation and delivery. And so we’ve been doing some [01:01:00] exploration on the video front where we have either content author driven videos or even user generated videos, and figuring out how to programmatically handle those in an efficient way. So we have been looking. We haven’t done a ton in the 3D space, but video is definitely something that’s on our radar, especially with video becoming a pretty popular medium right now.
A lot of places are moving into the video space and allowing for user generated video stuff and so definitely on our radar.
Jen Brissman: Absolutely. Yeah. Video and mobile and everything like that. And I actually find that Cloudinary is incredibly powerful when it comes to q_auto with video. The results are pretty mind boggling in my opinion.
Sam Brace: Exciting. Thank you so much for being here. This was fantastic. We appreciate all of your time, we appreciate your insights and obviously you’re doing some great work over at Formidable, as well as the entire Formidable team, so keep it up. This [01:02:00] is fantastic work.
Grant Sander: Yeah, thanks for having me. It was a blast.
Sam Brace: Excellent. And so of course, Jen, as we have pointed out right at the very beginning, but it’s worth stating again for everybody here, that all of these episodes, if you’ve enjoyed this one or wanna see what else we’ve talked about, because we’ve mentioned with some of these technologies that Grant covered, we’ve covered them in other episodes as well.
You can always visit all of those at cloudinary.com/podcasts and dive into all of the various episodes of podcasts that Cloudinary produces on that site. And of course, if you have other things that you wanna discuss around what Grant has covered here, the best place to do that is the Cloudinary Community.
That’s gonna be at community.cloudinary.com. And you can see that we have forums for people that wanna have more of a threaded conversation, and ways to keep track of things a little bit easier. But also we do have a Discord server for those that are Discord users, and to have that as a real-time chat option as well for the overall community.
So absolutely make sure that you [01:03:00] guys are diving into all of that content. And as mentioned before, everything that we covered in this episode is also covered in depth in Grant’s blog post, which is found on formidable.com, which we do recommend that you read for more details on everything that was in this overall episode.
So Jen, before we let everybody go, any final thoughts before we say goodbye to our friends and family here at DevJams?
Jen Brissman: Well, lots of thoughts, and that was a really interesting episode for sure. But one of the main takeaways for me is how creative you can be with the givens. So for instance, just something like version number, it’s not something, all of our customers use or that would apply to every use case.
And Grant was using it in a unique way and it just really goes to show you can be creative with what you’re given and yeah I wonder if anyone watching might do something like that now after, after seeing what Grant did.
Sam Brace: I agree, I agree. Absolutely. And amazing takeaway, Jen. So on behalf of everybody at Cloudinary and of course those that were involved with the overall [01:04:00] program in this development, thank you for being part of this DevJams episode, and we are excited to see you at the future ones that we do produce.
Take care and we’ll see you soon.