Caching best practices. The basics of client-side caching in plain words and examples: Last-Modified, ETag, Expires, Cache-Control: max-age and other headers, plus URL fingerprinting


Properly configured caching provides huge performance benefits, saves bandwidth and reduces server costs, but many sites implement caching poorly, creating a race condition that causes interconnected resources to become out of sync.

The vast majority of caching best practices fall into one of two patterns:

Pattern #1: immutable content + long max-age
Cache-Control: max-age=31536000
  • The content of the URL does not change, therefore...
  • The browser or CDN can easily cache the resource for a year
  • Cached content that is younger than the specified max-age can be used without consulting the server

Page: Hey, I need "/script-v1.js", "/styles-v1.css" and "/cats-v1.jpg" 10:24

Cache: I'm empty. How about you, Server? 10:24

Server: OK, here they are. By the way, Cache, they're good for a year, no more. 10:25

Cache: Thank you! 10:25

Page: Hurray! 10:25

The next day

Page: Hey, I need "/script-v2.js", "/styles-v2.css" and "/cats-v1.jpg" 08:14

Cache: I have the picture with cats, but not the rest. Server? 08:14

Server: Easy, here are the new CSS & JS. Once again, Cache: their shelf life is no more than a year. 08:15

Cache: Great! 08:15

Page: Thank you! 08:15

Cache: Hmm, I haven't used "/script-v1.js" & "/styles-v1.css" in quite some time. It's time to remove them. 12:32

Using this pattern, you never change the content of a specific URL, you change the URL itself:

Every URL has something that changes along with the content. This could be a version number, a modified date, or a content hash (which is what I chose for my blog).

Most server-side frameworks have tools that make this easy (in Django I use ManifestStaticFilesStorage); there are also tiny Node.js libraries that solve the same problem, for example gulp-rev.

However, this pattern is not suitable for things like articles and blog posts: their URLs cannot be versioned, yet their content may change. Seriously, I often make grammatical and punctuation errors and need to be able to update the content quickly.

Pattern #2: mutable content, always revalidated on the server
Cache-Control: no-cache
  • The content of the URL will change, which means...
  • Any locally cached version cannot be used without checking with the server

Page: Hey, I need the contents of "/about/" and "/sw.js" 11:32

Cache: I can't help you. Server? 11:32

Server: Here they are. Cache, keep them with you, but ask me before using them. 11:33

Cache: Will do! 11:33

Page: Thank you! 11:33

The next day

Page: Hey, I need the contents of "/about/" and "/sw.js" again 09:46

Cache: Just a minute. Server, are my copies okay? The copy of "/about/" is from Monday, and "/sw.js" is from yesterday. 09:46

Server: "/sw.js" has not changed... 09:47

Cache: Cool. Page, keep "/sw.js". 09:47

Server: ...but I have a new version of "/about/". Cache, hold onto it, but like last time, don't forget to ask me first. 09:47

Cache: Got it! 09:47

Page: Great! 09:47

Note: no-cache does not mean "do not cache", it means "check" (i.e. revalidate) the cached resource with the server before using it. It is no-store that tells the browser not to cache at all. Likewise, must-revalidate does not mean "always revalidate": the cached resource is used without revalidation while it is younger than the specified max-age, and only revalidated after that. The naming of these caching keywords is historically confusing.

In this pattern, we can add an ETag (a version identifier of your choice) or a Last-Modified header to the response. The next time the client requests the content, it sends If-None-Match or If-Modified-Since respectively, allowing the server to say "Use what you have, your cache is up to date," i.e. return an HTTP 304.

If sending ETag / Last-Modified is not possible, the server always sends the entire content.
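To make the exchange concrete, here is a minimal sketch of the server-side decision in pattern #2. The function name and return shape are invented for illustration; a real server would also compare If-Modified-Since against Last-Modified in the same way:

```javascript
// Sketch of the server's pattern #2 decision: compare the validator the
// client sent (If-None-Match) with the current ETag of the resource.
function revalidate(requestHeaders, currentEtag) {
  if (requestHeaders["if-none-match"] === currentEtag) {
    // The client's cached copy is still valid: send an empty 304,
    // saving the bandwidth of re-sending the body.
    return { status: 304 };
  }
  // The content changed (or the client has nothing cached): send the full
  // body along with a fresh validator and the no-cache instruction.
  return {
    status: 200,
    headers: { ETag: currentEtag, "Cache-Control": "no-cache" },
  };
}
```

The win of the 304 path is that only headers cross the wire; the body stays in the client's cache.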

This pattern always requires network calls, so it's not as good as the first pattern, which can do without network requests.

Often we don't have the infrastructure for the first pattern, but the network requests of pattern #2 can be a problem too. As a result, an intermediate option gets used: a short max-age on mutable content. This is a bad compromise.

Using max-age with mutable content is generally the wrong choice

And, unfortunately, it is common; GitHub Pages is an example.

Imagine:

  • /article/
  • /styles.css
  • /script.js

Served with this header:

Cache-Control: must-revalidate, max-age=600

  • The content of the URL changes
  • If the browser has a cached version fresher than 10 minutes, it uses it without consulting the server
  • Otherwise a network request is made, with If-Modified-Since or If-None-Match if possible

Page: Hey, I need "/article/", "/script.js" and "/styles.css" 10:21

Cache: I have nothing. How about you, Server? 10:21

Server: No problem, here they are. But remember, Cache: they can be used for the next 10 minutes. 10:22

Cache: Yes! 10:22

Page: Thank you! 10:22

Page: Hey, I need "/article/", "/script.js" and "/styles.css" again 10:28

Cache: Oops, I'm sorry, I lost "/styles.css", but I have everything else, here you go. Server, can you send me "/styles.css"? 10:28

Server: Easy, it has changed since you last took it. You can safely use it for the next 10 minutes. 10:29

Cache: No problem. 10:29

Page: Thank you! But it seems something went wrong! Everything is broken! What is going on? 10:29

This pattern can survive testing, but it breaks things in a real project and is very difficult to track down. In the example above, the server has updated the HTML, CSS and JS, but the page is rendered with the old cached HTML and JS plus the updated CSS from the server. The version mismatch ruins everything.

Often when we make significant changes to HTML, we change both the CSS to reflect the new structure and the JavaScript to keep up with the content and styling. These resources are interdependent, but caching headers cannot express this. As a result, users may end up with the latest version of one or two resources and old versions of the rest.

max-age is set relative to the response time, so if all the resources are requested as part of the same navigation, they will expire at roughly the same time, but there is still a small chance of desynchronization. If some pages don't include the JavaScript, or include different styles, their cache expiry dates will get out of sync. Worse, the browser regularly pulls content from the cache without knowing that the HTML, CSS & JS are interdependent, so it can happily take one item from the list and forget the rest. Taken together, these factors make the likelihood of mismatched versions quite high.

For the user, the result may be a broken page layout or other problems, from small glitches to completely unusable content.

Fortunately, users have an emergency exit...

Refreshing the page sometimes helps

If a page is loaded via a refresh, browsers always revalidate with the server, ignoring max-age. So if something broke for the user because of max-age, a simple page refresh can fix everything. But of course, by then the damage is done, and users will view your site with some distrust afterwards.

A service worker can extend the life of these bugs

For example, you have a service worker like this:

```javascript
const version = "2";

self.addEventListener("install", event => {
  event.waitUntil(
    caches.open(`static-${version}`)
      .then(cache => cache.addAll([
        "/styles.css",
        "/script.js"
      ]))
  );
});

self.addEventListener("activate", event => {
  // ...delete old caches...
});

self.addEventListener("fetch", event => {
  event.respondWith(
    caches.match(event.request)
      .then(response => response || fetch(event.request))
  );
});
```

This service worker:

  • caches the script and styles
  • serves from the cache on a match, otherwise goes to the network

If we change the CSS/JS, we also bump the version number, which triggers a service worker update. However, since addAll fetches through the HTTP cache, we can run into the max-age race condition and cache mismatched CSS & JS versions.

Once they are cached, we will serve incompatible CSS & JS until the next service worker update, and even then we may hit the race condition again.

You can bypass the HTTP cache when populating the service worker cache:

```javascript
self.addEventListener("install", event => {
  event.waitUntil(
    caches.open(`static-${version}`)
      .then(cache => cache.addAll([
        new Request("/styles.css", { cache: "no-cache" }),
        new Request("/script.js", { cache: "no-cache" })
      ]))
  );
});
```

Unfortunately, the fetch cache options are not supported in Chrome/Opera and have only just landed in a Firefox nightly, but you can do it yourself:

```javascript
self.addEventListener("install", event => {
  event.waitUntil(
    caches.open(`static-${version}`)
      .then(cache => Promise.all(
        ["/styles.css", "/script.js"].map(url =>
          // cache-bust using a random query string
          fetch(`${url}?${Math.random()}`).then(response => {
            // fail on 404, 500 etc
            if (!response.ok) throw Error("Not ok");
            return cache.put(url, response);
          })
        )
      ))
  );
});
```

In this example, I'm cache-busting with a random number, but you could go further and use a content hash generated at build time (this is similar to what sw-precache does). This is a kind of JavaScript implementation of pattern #1, but it only works for the service worker, not for browsers or a CDN.

Service workers and HTTP cache work great together, don't make them fight!

As you can see, you can work around caching bugs inside your service worker, but it's better to fix the root of the problem. Setting caching up correctly not only makes the service worker's job easier, it also helps browsers that don't support service workers (Safari, IE/Edge), and lets you get the most out of your CDN.

Proper caching headers can also make updating a service worker much easier.

```javascript
const version = "23";

self.addEventListener("install", event => {
  event.waitUntil(
    caches.open(`static-${version}`)
      .then(cache => cache.addAll([
        "/",
        "/script-f93bca2c.js",
        "/styles-a837cb1e.css",
        "/cats-0e9a2ef4.jpg"
      ]))
  );
});
```

Here I cache the root page with pattern #2 (server revalidation) and all other resources with pattern #1 (immutable content). Every service worker update triggers a request for the root page, but all other resources are downloaded only if their URL has changed. This is good because it saves traffic and improves performance, whether you're updating from the previous version or from a very old one.
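The same split can be expressed on the server when choosing response headers. A minimal sketch; the fingerprint regex is an assumption about how these assets happen to be named, not a general rule:

```javascript
// Hypothetical server-side rule matching the setup above:
// fingerprinted assets get pattern #1, everything else pattern #2.
const FINGERPRINTED = /-[0-9a-f]{8}\.(?:js|css|jpg|png)$/;

function cacheControlFor(url) {
  if (FINGERPRINTED.test(url)) {
    // Pattern #1: the content at this URL never changes.
    return "max-age=31536000";
  }
  // Pattern #2: always revalidate with the server before use.
  return "no-cache";
}
```

With a rule like this, a new deployment changes only the asset URLs, and clients pick up the change via the always-revalidated root page.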

This is a significant advantage over native apps, where even a small change forces downloading the entire binary or requires a complex binary diff. Here we can update a large web application with a relatively small download.

Service workers work better as an enhancement rather than a temporary crutch, so work with the cache instead of fighting it.

When used carefully, max-age and mutable content can work well together

max-age is very often the wrong choice for mutable content, but not always. For example, the original article has a max-age of three minutes. The race condition is not a problem because no interdependent resources on the page use the same caching pattern: the CSS, JS & images use pattern #1 (immutable content), and everything else uses pattern #2 (server revalidation).

This pattern means I can easily handle a popular article: my CDN (Cloudflare) can take the load off the server, as long as I'm willing to wait three minutes before an updated article becomes visible to users.

This pattern should be used with care. If I add a new section to an article and link to it from another article, I have created a dependency that can get out of sync: a user may click the link and get a copy of the article without the new section. To avoid this, I have to update the article, purge its cached copy from Cloudflare, wait three minutes, and only then add the link to the other article. Yes, this pattern requires caution.

When used correctly, caching provides significant performance improvements and bandwidth savings. Serve immutable content with a long max-age if you can easily change the URL, or use server revalidation. Mix max-age and mutable content only if you're brave and confident that your content has no dependencies that could get out of sync.

When making changes to websites, we often run into the fact that page content, CSS files and scripts (.js) are cached by the browser and stay unchanged for quite a long time. As a result, for the changes to show up in all browsers, you have to train clients to press F5 or Ctrl+F5, and check from time to time that they actually do.

The process is tedious and inconvenient. You could, of course, get around it by renaming the files each time, but again that is inconvenient.

However, there is a way that lets us keep the same file names and reset the caching of .css or .js files at exactly the moment we need, and forget about Ctrl+F5 forever.

The idea is to append a pseudo-parameter to our .css and .js URLs, which we change from time to time, thereby busting the browser cache.

The reference in the source code will then look like this:
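The original code sample did not survive extraction; assuming ordinary stylesheet and script references (the file names here are assumptions), it presumably looked something like this:

```html
<link rel="stylesheet" href="style.css?186485">
<script src="script.js?186485"></script>
```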

Here 186485 is an arbitrary value: the server returns the same file, but the browser treats the URL as new thanks to the pseudo-parameter ?186485.

Now, so that we don't have to change every occurrence of the parameter by hand, we set it once in a PHP file which we include everywhere we need it:
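The PHP snippet itself was also lost in extraction; a minimal sketch of the idea (the file name and variable name are assumptions) might be:

```php
<?php
// version.php: the single place where the cache-busting value lives;
// bump it whenever styles or scripts change.
$assets_version = "186485";
?>
```

And in every template that references the assets:

```php
<?php include "version.php"; ?>
<link rel="stylesheet" href="style.css?<?php echo $assets_version; ?>">
<script src="script.js?<?php echo $assets_version; ?>"></script>
```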


