I understand why this happens and what to do. Browser caching woe.
August 16, 2017 10:17 AM

Sometimes when someone goes to one of our older websites they get, say, yesterday’s image and wonder what’s up: why does it look wrong? I explain that it was “cached” and that they should refresh their browser or manually clear the cache. All’s fine; that works. They see the image that was loaded today.

But I don’t know the answer to “why does this seem to happen a lot on this site in particular and not on the other sites I look at a lot.” It does not appear to be an issue on newer sites I visit either.

Is there anything that could be done on our end (by a sysadmin, say) to deal with this? Does it have anything to do with the fact that it is an old CMS?

The thing is the people don’t know that the old image is cached on their computers so they don’t even know it’s a problem unless the image and text don’t seem to match....and they don't know about caches...so they don't just refresh when things look odd.
posted by Lescha to Technology (7 answers total) 3 users marked this as a favorite
 
There are timeouts you can set for files, which your web server sends in an HTTP header when it serves the file. It sounds like the timeout your old CMS is using is too long. How to change that behavior is specific to the software you're using, but it should be changeable somewhere.

By timeout, I mean the time that a cache is allowed to keep using the old version before it needs to look for a new version of the file.
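
To make the timeout idea concrete, here is a minimal Python sketch of the check a cache makes before reusing a stored file. The function name `is_fresh` and the one-week and one-hour values are made up for illustration; they are not taken from any particular CMS or browser.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(stored_at: datetime, max_age_seconds: int, now: datetime) -> bool:
    """Return True while a cached copy may be reused without asking the server."""
    return now - stored_at < timedelta(seconds=max_age_seconds)


# Hypothetical timeline: the image was cached yesterday, the user returns today.
stored = datetime(2017, 8, 15, 9, 0, tzinfo=timezone.utc)
visit = datetime(2017, 8, 16, 10, 17, tzinfo=timezone.utc)

one_week = 7 * 24 * 3600  # a long timeout: yesterday's image keeps being shown
one_hour = 3600           # a short timeout: the browser asks the server again

print(is_fresh(stored, one_week, visit))
print(is_fresh(stored, one_hour, visit))
```

With the long timeout the stale copy still counts as fresh a day later, which would match the symptom described in the question.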
posted by triscuit at 10:27 AM on August 16, 2017 [1 favorite]


Your web server can be configured to send "directives" telling the user's browser how long to keep the data (typically images) in its local cache. These are (in the Linux world, assuming you're using the Apache web server) in the httpd.conf file, but they can be overridden by a local .htaccess file depending on your setup. You can check out the documentation of the mod_expires module to see how this works, but it will only apply if you're using Apache.

If you are using Chrome or Safari, the developer tools will let you look at the network request that was made to the server, and the response. They will also show you whether the response came from the original server or from your local cache. There's a decent guide here.
posted by jenkinsEar at 10:30 AM on August 16, 2017


Set your http headers:

https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching

Cache-Control: no-cache
Cache-Control: no-store
Cache-Control: must-revalidate
Cache-Control: no-transform
Cache-Control: public
Cache-Control: private
Cache-Control: proxy-revalidate
Cache-Control: max-age=
Cache-Control: s-maxage=
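
As a rough illustration of how a few of these directives change what the browser does, here is a small Python sketch that splits a Cache-Control value into its directives and summarizes the result. The helper names `parse_cache_control` and `describe` are hypothetical, only three directives are handled, and the parsing is deliberately simplified.

```python
def parse_cache_control(header_value: str) -> dict:
    """Split a Cache-Control value into {directive: value or None}.

    Simplified: quoted values that themselves contain commas are not handled.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def describe(header_value: str) -> str:
    """Summarize, roughly, what a browser may do with the response."""
    d = parse_cache_control(header_value)
    if "no-store" in d:
        return "never cached"
    if "no-cache" in d:
        return "cached, but revalidated with the server before every use"
    if "max-age" in d:
        return f"reused without checking for {d['max-age']} seconds"
    return "left to the browser's own heuristics"


for value in ["no-store", "no-cache", "public, max-age=3600", ""]:
    print(repr(value), "->", describe(value))
```

The last case may be the interesting one for an old site: when the server sends no explicit freshness information, browsers are allowed to guess how long to keep a file.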

posted by at at 10:31 AM on August 16, 2017 [3 favorites]


Using cached images is a time- and bandwidth-saving measure. Forcing a reload every time you visit the page means re-downloading every image and bit of code - and if nothing's changed, that's a waste of time that can cost you money. So older browsers, especially, were designed with the thought that clearing the cache would be rare; you'd only do it when you needed to.

Websites with dynamic content are set to auto-refresh at least part of the code, so you'll see the new ad and the little popup that says "you've scrolled down half a page - would you like to subscribe now? No? How about now?"

Sites built on limited resources don't arrange for auto-updating, because that's a drain on the site's servers as well; they cope with a number of complaints of "I can't see the new stuff" and staff or other users get to say, "here's how to clear your cache/refresh the page."
posted by ErisLordFreedom at 11:43 AM on August 16, 2017


In general, the only reliable way to have a best-of-both-worlds where you have cached images that change when you need to update them is to make sure that the updated image is accessed with a different URL altogether. So, either you upload a new image and update the references to that image, or you do something like add a fake query parameter to the end of the URL to the image. HTTP ETag headers can mitigate this somewhat, but they don't always work in the face of aggressive caching, and they still require an HTTP connection to verify the cache, even if the server can reply with a quick "yep, your cache looks good" response.

So, generally, the sites that do this well are the ones that have put the time and effort into putting together a good system for reliably breaking the cache whenever they need to update, whether that's by versioning the URLs or by tweaking the caching headers to be less aggressive at the cost of extra bandwidth. As you've noticed, it's pretty easy to do this wrong.
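
One common way to version the URLs is to put a short hash of the file's contents into the file name, so the name changes exactly when the image does. The Python sketch below shows the idea; the function name `versioned_name`, the example path, and the 8-character digest length are arbitrary choices for illustration, not something a particular CMS does.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


# Placeholder bytes standing in for two different days' image files.
monday = versioned_name("images/main-image.jpg", b"monday's picture")
tuesday = versioned_name("images/main-image.jpg", b"tuesday's picture")

print(monday)
print(tuesday)
print(monday != tuesday)  # a changed image gets a new URL, so no stale copy is reused
```

Because an unchanged file keeps the same name, such files can be cached for a very long time without the stale-image problem.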
posted by Aleyn at 12:01 PM on August 16, 2017 [6 favorites]


You can also use ETags in combination with other settings to have the browser cache the files but check to see if they've changed each time it displays them, which will cut down on your and your users' bandwidth usage and make the page faster if they're returning and the cached image hasn't changed.
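
Here is a minimal Python sketch of the server side of that check, assuming the ETag is simply a hash of the file contents. The `respond` function is hypothetical and not tied to any real web framework, and it only handles a single exact tag in If-None-Match (no lists, no "*", no weak validators).

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(content: bytes) -> str:
    """Build a quoted ETag from a hash of the content."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond(content: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a request for this content."""
    etag = make_etag(content)
    # "no-cache" lets the browser keep a copy but makes it check back before reuse.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, b""  # "your copy is still good", no body sent
    return 200, headers, content


first = respond(b"monday's picture", None)
repeat = respond(b"monday's picture", first[1]["ETag"])
changed = respond(b"tuesday's picture", first[1]["ETag"])

print(first[0], repeat[0], changed[0])
```

The repeat visit costs a round trip but no image download, and the changed image is delivered in full because the old tag no longer matches.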
posted by Candleman at 3:40 PM on August 16, 2017


I agree with Aleyn -- the best fix for this is to change the image URL each time the image changes.

You can do this with the query string if you don't want to or can't change the image file name.

For example, have your page use an image tag like

<img src="main-image.jpg?date=2017-08-17" />

and get the server to change the date in the query string each day. This technique is sometimes called a "cache buster" query string.
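
A minimal sketch of generating that tag on the server, in Python. The function name `image_tag` is hypothetical, and it assumes the image URL has no query string of its own.

```python
from datetime import date
from html import escape


def image_tag(src: str, day: date) -> str:
    """Build an <img> tag whose URL carries the given date as a cache buster."""
    url = f"{src}?date={day.isoformat()}"  # assumes src has no existing "?"
    return f'<img src="{escape(url, quote=True)}" />'


print(image_tag("main-image.jpg", date(2017, 8, 17)))
# <img src="main-image.jpg?date=2017-08-17" />
```

In a real page you would pass `date.today()`, or better, the date the image was last changed, so the URL only changes when the image does.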
posted by richb at 2:59 AM on August 17, 2017 [2 favorites]

