Proxying: How do we Stop Worrying and Love the Squid?
April 2, 2012 11:02 AM Subscribe
Is Squid the Right Solution in a Windows Ecosystem? Recently, I was involved in giving a demonstration of some of our company's Web Services at a customer's site. Our services needed to go out and send HTTPS requests to other companies. Normally, external network access and hardening are left up to the customer. Between firewalls and politics, it was left to me to fix using one box in the DMZ. I had some experience maintaining a personal Squid proxy/cache (for sites such as Slashdot/Pandora), so I got approval to stand one up...
The demo is over, and now I am receiving questions about Squid that I can't easily answer. I've done some basic sanity checking on the squid.conf file: mostly explicit lists of src/dst hosts/ports.
What is the best resource for hardening the squid configuration file? (We want to follow best practices)
Where can I find some good examples of folks using Squid in production environments? (Our customer doesn't want to use things that are unproven)
Are there better HTTP Proxy utilities that I should be considering? (The customer is, in general, a Microsoft shop. This is probably the biggest detractor to Squid for them.)
Where can I read on how to design a Squid configuration that is primarily redundant and secondarily scalable?
Finally, what else should I be considering when researching secure HTTP Proxying?
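For context, my sanity checking so far amounts to explicit whitelisting with ACLs, along these lines (a sketch only; the addresses, domains, and ACL names below are placeholders, not our real config):

```
# Hypothetical squid.conf fragment: explicit src/dst whitelisting.
acl app_servers src 10.0.1.10 10.0.1.11          # our Web Services boxes
acl partner_sites dstdomain .partner-a.example .partner-b.example
acl Safe_ports port 443                          # HTTPS only
acl CONNECT method CONNECT

# Refuse CONNECT tunnels to anything but HTTPS ports, allow only our
# servers to reach the whitelisted partner domains, deny everything else.
http_access deny CONNECT !Safe_ports
http_access allow app_servers partner_sites
http_access deny all
```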
Best answer: http://www.linkedin.com/answers/technology/web-development/TCH_WDD/86991-2946399
posted by feloniousmonk at 2:59 PM on April 2, 2012
Squid is the Apache of caching. It isn't the newest, it isn't the flashiest, it isn't the fastest, but it is extremely solid and mature code.
Redundancy in a Squid environment can be handled in numerous ways, and is largely dependent upon the manner used to acquire/intercept requests. As a basic example, the use of a PAC file (including wpad, etc) might use a simple algorithm to pick two servers and return them. All clients making use of the same file would cause strong affinity for particular URLs getting routed to particular Squid servers, and would be very effective. However, this assumes that at least a little configuration on each client is possible. Interception of all port 80 traffic, redirected through a load balancer to your Squid cluster, is another option, though you need an intelligent load balancer capable of analyzing the request if you want any sort of affinity to a particular server for a given request. For the specific case of secure HTTPS proxying, the details of your networks and clients become more important, and it is probably worth engaging the Squid mailing lists on the topic.
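The PAC approach might look something like this (a sketch; the proxy hostnames and the hashing scheme are made up for illustration):

```javascript
// Minimal PAC sketch: pick deterministically between two Squid boxes,
// keyed on the hostname, so a given site always hits the same proxy.
// The second proxy in each list is the failover if the first is down.
function FindProxyForURL(url, host) {
    var sum = 0;
    for (var i = 0; i < host.length; i++) {
        sum += host.charCodeAt(i);
    }
    if (sum % 2 === 0) {
        return "PROXY proxy1.example.com:3128; PROXY proxy2.example.com:3128";
    }
    return "PROXY proxy2.example.com:3128; PROXY proxy1.example.com:3128";
}
```

Every client using this file sends the same host to the same proxy, which is where the URL affinity comes from.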
posted by jgreco at 8:19 PM on April 2, 2012
Response by poster: Thanks for the replies, guys!
@feloniousmonk - I'm assuming their IT people are going to come back with Forefront, assuming they take ownership of the issue.
@jgreco - I had totally blanked on using the Squid mailing list. Most likely we will suggest a (software) HTTP Load Balancer to the cluster. (Unless the customer has a hardware one we can use.)
As another aside, we are not caching any of the data within Squid, so we don't need to worry about which URLs are fetched by what instance. The only intelligence we would require would involve "how busy are you" metrics.
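Something like HAProxy's leastconn mode would cover the "how busy are you" part (a sketch only; the names, addresses, and ports here are invented):

```
# Hypothetical haproxy.cfg fragment: spread requests across two Squid
# instances by current connection count, with basic health checks.
frontend proxy_in
    bind *:3128
    mode http
    default_backend squid_pool

backend squid_pool
    mode http
    balance leastconn        # "how busy are you": fewest active connections wins
    server squid1 10.0.2.10:3128 check
    server squid2 10.0.2.11:3128 check
```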
posted by redsai at 8:47 AM on April 3, 2012
Think you might find Varnish a better performer, then. Squid's got the caching edge.
posted by jgreco at 6:05 AM on April 4, 2012
Even though I'm late to the party here, I'd like to second the suggestion to use TMG. As has been said, it's an MS product, so your corp. IT may prefer it. I've been using it in high-volume production roles for years - back since ISA 2004 - and it's never let me down.
It can actually do more than both Squid/Varnish, but not as quickly. Then again, we're talking the difference between 900 MPH and 975 MPH, so unless you are a performance stickler, this shouldn't matter too much.
TMG can do your load balancing, it can do in-line threat protection, reverse caching, multiple-server availability clusters, and one of my favorite features - HTTPS proxy inspection (inbound and outbound). It's very easy to set up a single HTTPS listener with a wildcard certificate (like *.mydomain.com) and be able to publish multiple websites on one IP, all protected with SSL.
I only have two complaints with TMG over the years:
1. No support for WCCP (this is an MS limitation b/c of the GRE tunnel requirement)
2. It's not free.
posted by bfu at 7:52 AM on April 4, 2012
This thread is closed to new comments.