Serving images is a critical part of Etsy’s infrastructure. Our sellers depend on displaying clean, visually appealing pictures to showcase their products. Buyers can learn about an item by reading its description, but there’s a tangibility to seeing a picture of that custom necklace, perfectly-styled shirt, or one-of-a-kind handbag you’ve found.
Our image infrastructure at its core is simple. Users will upload images, we convert them to jpegs on dedicated processing machines (ImgWriters) and store them in large datastores (Mediastores). Upon request, we have specialized hosts (ImgCaches) that fetch the images from the Mediastores and send them to our users.
Of course, there are many smaller pieces that make up our Photos infrastructure. For example, there’s load balancing involved with our ImgWriters to spread requests across the cluster, and our ImgCaches have a level of local in-memory caching for images that are frequently requested, coupled with a hashing scheme to reduce duplication of work. In front of that, there’s also caching on our CDN side. Behind the writers and caches, there’s some redundancy in our mediastores both for the sake of avoiding single point of failure scenarios and for scale.
Throughout this architecture, though, one of our biggest constraints is the size of the data we pass around. Our CDN’s will only be able to cache so much before they begin to evict images, at which point requests are routed back to our origin. Larger images will require heavier use of our CDN’s and drive up costs. For our users, as the size of images increases so too will the download time for a page, impacting performance and resulting in a suboptimal user experience. Likewise, any post-processing of images internally is more bandwidth we’re using on our networks and increases our storage needs. Smaller images help us scale further and faster.
As we’re using jpegs as our format, an easy way to reduce the data size of our images is to lower the jpeg quality value, compressing the contents of the image. Of course, as we compress further we’ll continue to reduce the visual quality of an image—edges will grow blurry and artifacts will appear. Below is an example of such. The image on the left has a jpeg quality of 92 while that on the right has one of 25. In this extreme case, we can see the picture of a bird on the right begins to show signs of compression – the background becomes “blocky” and there are “artifacts” around edges, such as around the bird and the blockiness of the background behind.
The left with jpeg quality 92, the right jpeg quality 25.
For most of Etsy’s history processing images, we’ve kept to a generally held constant that setting the quality of a jpeg image to 85 is assumed to be a “good balance”. Any larger and we probably won’t see enough of a difference to warrant all those extra bytes. But could we get away with lowering it to 80? Perhaps the golden number is 75? This is all very dependent upon the image a user uploads. A sunset with various gradients may begin to show the negative effects of lowering the jpeg quality before a picture of a ring on a white background. What we’d really like is to be able to determine what is a good jpeg quality to set per image.
To do so, we added an image comparison tool called dssim to our stack. Dssim computes how dissimilar two images are using the SSIM algorithm for estimating human vision. Given two (or more) images, it will compare the structural similarity and produce a value from 0 (exactly the same) to unbounded. Note that while the README states that it only compares two pngs, recent updates to the library have also allowed the use of comparison between jpegs.
With the user’s uploaded image as a base, we can do a binary search on the jpeg quality to find a balance between the visual quality (how good it looks) to the jpeg quality (the value that can help determine compression size). If we’ve assumed that 85 is a good ceiling for images, we can try lowering the quality to 75. Comparing the image at 85 and 75, if the value returned is higher than a known threshold of what is considered “good” for a dssim score, we can raise the jpeg quality back up to half way between, 80. If it’s below a threshold, maybe we can try push the quality a little lower at say 70. Either way, we continue to compare until we get to a happy medium.
This algorithm is adapted from Tobias Baldauf’s work in compressing jpegs. We first tested compression using this bash script before running into some issues with speed. While this would do well for offline asynchronous optimizations of images, we wanted to keep ours synchronous so that users could see the same image on upload as they would when it was displayed to users. We were breaking 15-20 seconds on larger images by shelling out from PHP and so dug in a bit to look for some performance wins.
Part of the slowdown was that cjpeg-dssim was converting from jpeg to png to compare with dssim. Since the time the original bash script was written, though, there had been an update to allow dssim to compare jpegs. We cut off half the time or more by not having to do the conversions each way. We also ported the code to PHP, as we already had the image blob in memory and could avoid some of the calls reading and writing the image to disk each time. We also adjusted to be a bit more conservative with some of the resizing, limiting it to a range of jpeg quality between 85 and 70 to short circuit the comparisons by starting at an image quality of 78 and breaking out if we went above 85 or below 70. As we cut numerous sizes for an image, we could in theory perform this for each resized image, but we rarely saw the jpeg quality value differ and so use the calculated jpeg quality for one size on all cropped and resized images of a given upload.
As you’d expect, we’re adding work to our image processing and so had to take that into account. We believed that the additional upfront cost in time during the upload would pay for itself many times over. The speed savings in downloading images means an improvement in the time to render the important content on a page. We were adding 500ms to our processing time on average and about 800ms for the upper 90th percentile, which seemed within acceptable limits.
This will vary, though, on the dimensions of the image you’re processing. We’re choosing an image we display often, which is 570px in width and a variable height (capped at 1500px). Larger images will take longer to process.
As we strive to Graph All the Things™, we kept track of how the jpeg quality changed from the standard 85 to vary from image to image, settling at roughly 73 on average.
The corresponding graphs for bytes stored, as expected, dropped as well. We saw reductions often ranging from 25% to 30% in the size of images stored, comparing the file sizes of images in April of 2016 with those in April of 2017 on the same day:
Total file size of images in bytes in 1 week in April, 2016 vs 2017
Page load performance
So how does that stack up to page load on Etsy? As an example, we examined the listing page on Etsy, which displays items in several different sizes: a medium size when the page first loads, a large size when zoomed in, and several tiny thumbnails in the carousel as well as up to 8 on the right that show other items from the shop. Here is a breakdown of content downloaded (with some basic variations depending upon the listing fetched):
At Etsy, we frontload our static assets (js, fonts, css) and share it throughout the site, though, so that is often cached when a user reaches the listings page. Removing those three, which total 6,371,718 bytes, we’re left with 1,113,402 bytes, of which roughly 92% is comprised of image data. This means once a user has loaded the site’s assets, the vast majority of their download is then the images we serve on a given page as they navigate to each listing. If we apply our estimate of 23% savings to images, reducing image data to 787,571 and the page data minus assets to 878,153, on a 2G mobile device (roughly 14.4 kbps) it’s the difference in downloading a page in 77.3 seconds vs 60.8 seconds. For areas without 4G service, that’s a big difference! This of course is just one example page and the ratio of image size to page size will vary depending upon the listing.
Images that suffer more
Not all images are created equal. What might work for a handful of images is not going to be universal. We saw that some images in particular have a greater drop in jpeg quality while also suffering visual quality in our testing. Investigating, there are a few cases where this system could use improvements. In particular, jpeg quality typically suffers more when images contain line art or thin lines, text, and gradients, as the lossy compression algorithms for jpegs general don’t play well in these contexts. Likewise, images that have a large background area of a solid color and a small focus of the product would suffer, as the compression would run heavily on large portions of empty space in the image and then over compress the focus of the image. This helped inform our lower threshold of 70 for jpeg quality, at which we decided further compression wasn’t worth the loss in visual quality.
The first uncompressed, the second compressed to jpeg quality 40.
Page load speed isn’t the only win for minifying images, as our infrastructure also sees benefits from reducing the file size of images. CDN’s are a significant piece of our infrastructure, allowing for points of presence in various places in the world to serve static assets closer to the location of request and to offload some of the hits to our origin servers. Small images means we can cache more files before running into eviction limits and reduce the amount we’re spending on CDN’s. Our internal network infrastructure also scales further as we’re not pushing as many bytes around the line.
Availability of the tool and future work
We’re in the process of abstracting the code behind this into an open source tool to plug into your PHP code (or external service). One goal of writing this tool was to make sure that there was flexibility in determining what “good” is, as that can be very subjective to users and their needs. Perhaps you’re working with very large images and are willing to handle more compression. The minimum and maximum jpeg quality values allow you to fine tune the amount of compression that you’re willing to accept. On top of that, the threshold constants we use to narrow in when finding the “sweet spot” with dssim are configurable, giving you leverage to adjust in another direction. In the meantime, following the above guidelines on using a binary search approach combined with dssim should get you started.
Not all of the challenges behind these adjustments are specifically technical. Setting the jpeg quality of an image with most software applications is straightforward, as it is often an additional parameter to include when converting images. Deploying such a change, though, requires extensive testing over many different kinds of images. Knowing your use cases and the spectrum of images that make up your site is critical to such a change. With such a shift, knowing how to roll back, even if it requires a backfill, can also be useful as a safety net. Lastly, there’s merit in knowing that tweaking images is rarely “finished”. Some users like an image that heavily relies on sharpening images, while another might want some soft blending when resizing. Keeping this in mind was important for us to find out how we can continue to improve upon the experience our users have, both in terms of overall page speed and display of their products.