Help me set up a simple 100% caching forward proxy for my home. I dicked around with squid and then apache+mod_proxy/mod_cache yesterday afternoon, and while they proxy beautifully, they don't seem to get many cache hits even on static content: the hit rate is almost or completely 0%.
posted by hincandenza to computers & internet (5 answers total)
First, I am at work so I'm going to be a touch fuzzy on the details, and can't implement anything till I'm home.
I'm looking to set up a forward caching proxy for static content on my home network, mostly for browsing on my home machines (not counting mobile devices like ipad, iphone, android) and largely from FF on Mac, which is my principal browser. That Mac is running FF with FoxyProxy Standard installed, and I set up rules such that "mostly" static content like jpg, gif, png, css, js and known static pages from whitelisted URLs/sites (even those with queries that I can trust to be minimally volatile for my purposes) are sent to the proxy to hopefully be cached for a few days, and thus avoid the RTT/lag of loading from the original site on repeat visits or browsing around, especially given how flaky my Comcast is. This is especially useful for sites like imgur, which I visit more often than is healthy and which has all of those thumbnail images on the home page. I also have a couple of GM scripts that do preloading for sites like Craigslist etc, so on page reload a cache would really cut down on the traffic going out past my router.
Anything not matching these whitelist rules of *.jpg, etc will bypass the proxy altogether and load as normal. And yes, I am well aware of the risks, but honestly I trust my instincts and web knowledge, and ability to one-click disable foxyproxy if I suspect erratic behavior. And no, the browser's default behavior is not caching nearly enough for my tastes.
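To make the whitelist rule concrete, here is a rough Python sketch of the routing decision described above. It is illustrative only: FoxyProxy has its own pattern syntax, and the pattern list and the `static.example.com` host are placeholders, not the actual rules in use.

```python
from fnmatch import fnmatch
from urllib.parse import urlsplit

# Hypothetical stand-ins for the FoxyProxy whitelist patterns.
STATIC_PATTERNS = ["*.jpg", "*.gif", "*.png", "*.css", "*.js"]
# Hypothetical trusted host whose pages go to the proxy even with a query string.
TRUSTED_HOSTS = ["static.example.com"]


def use_proxy(url):
    """Return True if the URL should be sent to the caching proxy."""
    parts = urlsplit(url)
    if (parts.hostname or "") in TRUSTED_HOSTS:
        return True
    # Match on the path only, so a query string does not hide the extension.
    path = parts.path.lower()
    return any(fnmatch(path, pattern) for pattern in STATIC_PATTERNS)
```

With these placeholder rules, `use_proxy("http://site.com/static/images/1234abcd.jpg")` is true, while a dynamic page such as `http://site.com/index.php?id=3` bypasses the proxy.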
Base machine is a 2008 Mac Pro running OSX 10.6 (Snow Leopard, I believe- definitely not Lion). I have VMWare Fusion 4 running a couple of Windows VMs and an Ubuntu 11.10 VM. I was setting up the proxies in my Win2k3 VM, simply because it was there and acts as little more than a VPN client for TS'ing to work, and tends to be running in the background as often as the Mac is powered on- which is to say, 24/7.
The ideal here, for my short term purposes, is a caching proxy on the Win2k3 VM or the Mac (the Linux is for experimenting, and is less stable/consistently there) where I can filter in the browser to effectively have a local disk cache that supplants the browser's in a way I can explicitly view and control.
FoxyProxy is working fine when enabled, as I see the traffic going to the proxy only for those whitelists, and pages continue to work fine.
With both Squid and Apache mod_proxy/mod_cache/mod_disk_cache, they seem to work great as proxies, and even seem to create cache files... yet even for urls that aren't parameterized such as http://site.com/static/images/1234abcd.jpg, they both show evidence of cache miss despite repeat visits. Even just clicking forward/back shows the browser requests the content anew, the proxy logs show a cache miss (TCP_MISS in Squid, the SetEnv/CustomLog trick in Apache, and Netmon 3.x to confirm the outbound re-request by the proxy for content it ostensibly cached). Some content does get written to the cache folder, but doesn't appear to be used- the cache miss ratio is almost 100% in Apache, and exactly 100% in Squid.
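For putting a number on the miss ratio rather than eyeballing the log, a small script along these lines could help. It is a sketch that assumes Squid's default native access.log layout, where the fourth whitespace-separated field looks like `TCP_MISS/200`; the two sample lines are made up for illustration and are not from the actual logs.

```python
def hit_ratio(log_lines):
    """Fraction of requests whose Squid result code contains HIT."""
    hits = 0
    total = 0
    for line in log_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        # The fourth field is "<result code>/<HTTP status>", e.g. TCP_MISS/200.
        result_code = fields[3].split("/")[0]
        total += 1
        # Rough approximation: only codes containing "HIT" count as hits.
        if "HIT" in result_code:
            hits += 1
    return hits / total if total else 0.0


# Made-up sample lines in the assumed native format.
SAMPLE = [
    "1325376000.123     45 192.168.1.10 TCP_MISS/200 5120 GET "
    "http://site.com/static/images/1234abcd.jpg - DIRECT/203.0.113.5 image/jpeg",
    "1325376005.456      2 192.168.1.10 TCP_HIT/200 5120 GET "
    "http://site.com/static/images/1234abcd.jpg - NONE/- image/jpeg",
]
```

On a real log this would be called as `hit_ratio(open(path))`, where `path` points at wherever the install writes access.log.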
Squid was a snap to install and set up, but looking at its logs while it proxied, it was doing a TCP_MISS 100% of the time, despite the cache folder being populated with *some* content. I tried adjusting the refresh_pattern and cache rules, and again this would result in content being written to disk... and then showing TCP_MISS in the logs 100% of the time on page reload (by reload I mean both F5, and simply revisiting the same URL in a new tab).
Because Apache is about as universal as it gets, I tried that after Squid failed, and it exhibited the same behavior: proxies for images fine, writes files to disk, so the browsing is seamless... but doesn't appear to actually use the cache on followup visits. I tried enabling just about every cache element in mod_cache including the multiple items to ignore certain headers and those that violate the HTTP standard and would normally be a bad idea if I wasn't whitelisting via FoxyProxy... but no dice: it still won't cache.
Basically, I want a 100% caching forward proxy that I can whitelist some types of traffic to (via FoxyProxy) and have them served from disk cache for N minutes/days (configurable) before expiring. Ostensibly, Apache should work fine for this, but while it's caching some files to disk, it doesn't then use them. I'd prefer to run the caching proxy on the Win2k3 VM, but since I have Mac and Linux as options those would work as well, although the Linux VM is the most volatile of the three, since it gets upgraded/rebuilt relatively often.
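The behavior being asked for ("serve from disk for N minutes, then expire") can be sketched in a few lines of Python. This is not how Squid or Apache implement their caches, and it deliberately ignores HTTP cache headers. It only pins down the desired semantics. The class name `TtlDiskCache` and its parameters are invented for the illustration.

```python
import hashlib
import os
import tempfile
import time


class TtlDiskCache:
    """Store response bodies on disk and treat them as fresh for ttl_seconds."""

    def __init__(self, ttl_seconds, directory=None, clock=time.time):
        # Defaults to a fresh temp directory; a real setup would pass a fixed path.
        self.directory = directory or tempfile.mkdtemp()
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        os.makedirs(self.directory, exist_ok=True)

    def _path(self, url):
        # Hash the URL so any URL maps to a safe file name.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, name)

    def put(self, url, body):
        path = self._path(url)
        with open(path, "wb") as handle:
            handle.write(body)
        # Record the store time as the file's mtime.
        now = self.clock()
        os.utime(path, (now, now))

    def get(self, url):
        """Return the cached body, or None if missing or older than the TTL."""
        path = self._path(url)
        if not os.path.exists(path):
            return None
        if self.clock() - os.path.getmtime(path) > self.ttl_seconds:
            os.remove(path)  # expired: drop it so the next request refetches
            return None
        with open(path, "rb") as handle:
            return handle.read()
```

A proxy built around this would call `get(url)` first and only go to the origin site (followed by `put`) when it returns `None`. The `clock` parameter exists so that expiry can be exercised without waiting.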