I understand why this happens and what to do. Browser caching woe.
August 16, 2017 10:17 AM
Sometimes when someone goes to one of our older websites, they get, say, yesterday’s image and wonder why it looks wrong. I explain that the image was cached and that they should refresh their browser or manually clear its cache. That works: they then see the image that was uploaded today.
But I don’t know the answer to “why does this seem to happen a lot on this site in particular and not on the other sites I look at a lot.” It does not appear to be an issue on newer sites I visit either.
Is there anything that could be done on our end (by a sysadmin, say) to deal with this? Does it have anything to do with the fact that it is an old CMS?
The thing is, people don’t know that the old image is cached on their computers, so they don’t even know there is a problem unless the image and text don’t seem to match. And because they don't know about caches, they don't think to refresh when things look odd.
Your web server can be configured to send "directives" telling the user's browser how long to keep data (typically images) in its local cache. If you're using the Apache web server, these are set in the httpd.conf file, but depending on your setup they can be overridden by a local .htaccess file. The documentation for the mod_expires module explains how this works, but it only applies if you're using Apache.
If you are using Chrome or Safari, the developer tools will let you look at the network request that was made to the server and the response that came back. They will also show you whether the response came from the original server or from your local cache. There's a decent guide here.
posted by jenkinsEar at 10:30 AM on August 16, 2017
Set your HTTP headers:
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching
Cache-Control: no-cache
Cache-Control: no-store
Cache-Control: must-revalidate
Cache-Control: no-transform
Cache-Control: public
Cache-Control: private
Cache-Control: proxy-revalidate
Cache-Control: max-age=
Cache-Control: s-maxage=
posted by at at 10:31 AM on August 16, 2017 [3 favorites]
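To make the header list above a little more concrete, here is a minimal Python sketch that splits a Cache-Control value into its directives and checks whether it gives the browser an explicit lifetime during which it may reuse its copy without asking the server. The function names are made up for illustration, and the check is deliberately rough: it ignores Expires, s-maxage, and shared caches. One thing worth knowing is that when a server sends no explicit lifetime at all, browsers are allowed to guess one (often from the Last-Modified date), which may be part of why an older site shows stale images more often.

```python
def parse_cache_control(header_value: str) -> dict:
    """Split a Cache-Control header value into {directive: value or None}."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue  # skip empty pieces such as a trailing comma
        name, sep, value = part.partition("=")
        # Directives without "=" (e.g. "no-cache") are stored with value None.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def has_explicit_fresh_lifetime(header_value: str) -> bool:
    """Rough check (hypothetical helper): does the header let a browser
    reuse its cached copy for some time without contacting the server?"""
    directives = parse_cache_control(header_value)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) > 0


print(parse_cache_control("public, max-age=3600"))
print(has_explicit_fresh_lifetime("no-cache"))
```

A daily-changing image would probably want something like a short max-age, or no-cache combined with a validator, rather than a long lifetime.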
Using cached images is a time- and bandwidth-saving measure. Forcing a reload every time you visit the page means re-downloading every image and bit of code, and if nothing's changed, that's a waste of time that can cost you money. So older browsers, especially, were designed with the thought that clearing the cache would be rare; you'd only do it when you needed to.
Websites with dynamic content are set to auto-refresh at least part of the code, so you'll see the new ad and the little popup that says "you've scrolled down half a page - would you like to subscribe now? No? How about now?"
Sites built on limited resources don't arrange for auto-updating, because that's a drain on the site's servers as well; they cope with a number of complaints of "I can't see the new stuff" and staff or other users get to say, "here's how to clear your cache/refresh the page."
posted by ErisLordFreedom at 11:43 AM on August 16, 2017
In general, the only reliable way to get the best of both worlds, with cached images that still change when you update them, is to make sure that the updated image is accessed with a different URL altogether. So, either you upload a new image and update the references to that image, or you do something like add a fake query parameter to the end of the image's URL. HTTP ETag headers can mitigate this somewhat, but they don't always work in the face of aggressive caching, and they still require an HTTP connection to verify the cache, even if the server can reply with a quick "yep, your cache looks good" response.
So, generally, the sites that do this well are the ones that have put the time and effort into putting together a good system for reliably breaking the cache whenever they need to update, whether that's by versioning the URLs or by tweaking the caching headers to be less aggressive at the cost of extra bandwidth. As you've noticed, it's pretty easy to do this wrong.
posted by Aleyn at 12:01 PM on August 16, 2017 [6 favorites]
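One way the URL-versioning idea above could be sketched in Python: derive a short fingerprint from the image bytes and append it as a query parameter, so the URL changes exactly when the image does. The function name `versioned_url` and the parameter name `v` are assumptions for illustration, not something any particular CMS provides.

```python
import hashlib


def versioned_url(url: str, content: bytes) -> str:
    """Append a short content fingerprint to the URL as a query parameter.

    The same bytes always give the same URL (so caches keep working);
    different bytes give a different URL (so caches are bypassed).
    """
    digest = hashlib.sha256(content).hexdigest()[:10]
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}v={digest}"


# Stand-in byte strings; a real site would read the image file instead.
print(versioned_url("main-image.jpg", b"yesterday's image bytes"))
print(versioned_url("main-image.jpg", b"today's image bytes"))
```

With URLs like these, the image itself can be served with a long cache lifetime, because a changed image is always requested under a new URL.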
You can also use ETags in combination with other settings to have the browser cache the files but check whether they've changed each time it displays them, which will cut down on bandwidth usage for both you and your users and make the page faster for returning visitors when the cached image hasn't changed.
posted by Candleman at 3:40 PM on August 16, 2017
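As a rough illustration of the ETag approach described above, the sketch below shows the server-side decision in Python: send the full image with an ETag the first time, and answer a matching If-None-Match with an empty 304 response afterwards. The function names are hypothetical, the ETag is simply a hash of the content, and details such as weak validators (`W/`) are ignored. Pairing it with `Cache-Control: no-cache` is one assumed choice for the "other settings" that make the browser revalidate on each use.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(content: bytes) -> str:
    """Build a quoted ETag from a hash of the content (one possible scheme)."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond(content: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a request for `content`.

    `if_none_match` is the value of the request's If-None-Match header,
    or None if the browser has no cached copy to validate.
    """
    etag = make_etag(content)
    # "no-cache" lets the browser store the file but asks it to revalidate before reuse.
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""  # cached copy is still good; send no body
    return 200, headers, content


status, headers, body = respond(b"image bytes", None)
print(status, headers["ETag"])
print(respond(b"image bytes", headers["ETag"])[0])
```

The browser still makes a request every time, but an unchanged image costs only a few hundred bytes of headers rather than a full download.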
I agree with Aleyn -- the best fix for this is to change the image URL each time the image changes.
You can do this with the query string if you don't want to or can't change the image file name.
For example, have your page use an image tag like
<img src="main-image.jpg?date=2017-08-17" />
and get the server to change the date in the query string each day. This technique is sometimes called a "cache buster" query string.
posted by richb at 2:59 AM on August 17, 2017 [2 favorites]
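For the date-based cache buster above, a minimal Python sketch of what "get the server to change the date" might look like when the page is generated. The helper name `image_tag` is hypothetical; a real CMS would do the equivalent in its own template language.

```python
from datetime import date
from html import escape


def image_tag(src: str, day: date) -> str:
    """Build an <img> tag whose URL carries the given day as a cache-buster."""
    url = f"{src}?date={day.isoformat()}"
    return f'<img src="{escape(url, quote=True)}" />'


# A fixed date reproduces the example from the thread.
print(image_tag("main-image.jpg", date(2017, 8, 17)))
# When rendering a live page, the server would pass today's date instead.
print(image_tag("main-image.jpg", date.today()))
```

This assumes the image changes at most once a day; if it can change more often, a timestamp or content hash in the query string would be a safer choice.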
This thread is closed to new comments.
By timeout, I mean the time that a cache is allowed to keep using the old version before it needs to look for a new version of the file.
posted by triscuit at 10:27 AM on August 16, 2017 [1 favorite]