Network admin / geek question: I'm in the process of building a small cluster of servers for my quickly growing wallpaper website, but I've never charted those waters before. I found a nice article on TorrentFreak about Mininova's server setup, and found it quite intriguing. My question is, are there any other big-ish sites like Mininova that are open about their hardware infrastructure? Googling around yielded few results. I'd be interested to see the setup for a site like Digg, MeFi, or Flickr. Thanks in advance.

Currently, I have a single dual P3 load balancer, four DL360 servers (each 2x 2.8GHz Xeon) for Apache, one DL560 (quad 2.8GHz Xeon) for MySQL and NFS, and a few misc. servers.

Do most large sites use NFS to share the website source? Any idea how much load NFS puts on a server? Also, any idea how much overhead a load balancer needs to route connections? Can a dual P3 1.6GHz handle a 15 Mbit-ish site?

Any links or advice would be appreciated.

A 486 could easily push 10MBPS as a load balancer. Don't worry at all about that. ;)
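
To put rough numbers on the 15 Mbit question, here is a back-of-envelope sketch of how many packets per second a balancer would have to forward. The average packet sizes (1500 and 500 bytes) are assumptions for illustration, not measurements from any real site, and the function name is made up.

```python
# Back-of-envelope: packet rate a load balancer must forward at a given bandwidth.
# The packet sizes used below are assumed values, not measurements.

def packets_per_second(mbit_per_s: float, avg_packet_bytes: int) -> float:
    """Return packets/second needed to carry mbit_per_s at the given average packet size."""
    bits_per_packet = avg_packet_bytes * 8
    return mbit_per_s * 1_000_000 / bits_per_packet


if __name__ == "__main__":
    for size in (1500, 500):
        rate = packets_per_second(15, size)
        print(f"{size}-byte packets at 15 Mbit/s: {rate:.0f} packets/s")
```

Under those assumptions the rate comes out to a few thousand packets per second; real traffic will vary with the mix of small and large packets.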

What you might look into after the load balancer is adding a caching layer: Squid in reverse-proxy mode, to take load off the Apache servers.
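
To show why a caching reverse proxy takes load off the backends, here is a toy sketch of the idea. It is not how Squid is implemented or configured; the class name, the `origin` callable, and the fixed TTL are all hypothetical stand-ins.

```python
import time
from typing import Callable, Dict, Tuple


class TinyReverseProxyCache:
    """Toy stand-in for a caching reverse proxy (illustration only, not Squid)."""

    def __init__(self, origin: Callable[[str], bytes], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.origin = origin          # hypothetical: fetches a path from a backend web server
        self.ttl = ttl_seconds        # assumed fixed freshness lifetime for every object
        self.clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        now = self.clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            # Fresh copy in cache: the backend is not touched at all.
            self.hits += 1
            return entry[1]
        # Missing or expired: fetch from the backend and remember the result.
        self.misses += 1
        body = self.origin(path)
        self._store[path] = (now + self.ttl, body)
        return body


if __name__ == "__main__":
    backend_calls = []

    def fake_backend(path: str) -> bytes:
        backend_calls.append(path)
        return b"contents of " + path.encode()

    cache = TinyReverseProxyCache(fake_backend)
    for _ in range(100):
        cache.get("/wallpapers/popular.jpg")
    print(f"requests=100 backend_calls={len(backend_calls)} hits={cache.hits}")
```

For a site that serves the same popular images over and over, most requests would be answered from the cache and never reach Apache.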

Wikipedia is very open about their structure, since they're a nonprofit ... that's the largest open site I know of. Another good source is LiveJournal's Brad Fitzpatrick and his presentations on how they scaled their organization up to zomgwtfbbq.
Oh, and so far you're doing pretty much what I've been doing for my high-availability setups ... 'cept I usually integrate Squid somehow and also usually use memcached to store sessions instead of relying on NFS not to screw up. (But I don't like NFS that much... our setups are stable right now, but around 2000 I had some serious problems with some NFS setups and it soured me on the technology.) What I generally have is a set of dynamic servers running PHP/Apache, and a set of static asset servers running lighttpd and just pushing content.
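
A minimal sketch of the shared-session idea, assuming a key/value backend with `get`/`set`/`delete`. `FakeMemcached` is a hypothetical in-process stand-in so the example runs on its own; a real setup would use a memcached client talking to a memcached server. The `sess:` key prefix and 30-minute TTL are also assumptions.

```python
import json
import secrets
import time
from typing import Callable, Dict, Optional, Tuple


class FakeMemcached:
    """Hypothetical in-process stand-in for a memcached server (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[Optional[float], bytes]] = {}
        self._clock = clock

    def set(self, key: str, value: bytes, expire: int = 0) -> None:
        # expire=0 means "never expires" in this sketch.
        deadline = self._clock() + expire if expire else None
        self._data[key] = (deadline, value)

    def get(self, key: str) -> Optional[bytes]:
        item = self._data.get(key)
        if item is None:
            return None
        deadline, value = item
        if deadline is not None and deadline <= self._clock():
            del self._data[key]  # expired: behave as if it was never stored
            return None
        return value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class SessionStore:
    """Each web server holds one of these, all pointing at the same backend."""

    def __init__(self, backend: FakeMemcached, ttl_seconds: int = 1800) -> None:
        self.backend = backend
        self.ttl = ttl_seconds

    def create(self, data: dict) -> str:
        sid = secrets.token_hex(16)  # random session id to hand back in a cookie
        self.backend.set("sess:" + sid, json.dumps(data).encode(), expire=self.ttl)
        return sid

    def load(self, sid: str) -> Optional[dict]:
        raw = self.backend.get("sess:" + sid)
        return json.loads(raw) if raw is not None else None

    def destroy(self, sid: str) -> None:
        self.backend.delete("sess:" + sid)


if __name__ == "__main__":
    shared = FakeMemcached()
    web1, web2 = SessionStore(shared), SessionStore(shared)
    sid = web1.create({"user": "alice"})
    # The next request can land on a different web server and still find the session.
    print(web2.load(sid))
```

Because the session lives in the shared store rather than on one box's disk or an NFS mount, the load balancer is free to send each request to any dynamic server.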

And NFS, if it's placed on a dedicated box (we usually use Solaris for this because of ZFS), can push a gig a sec easily as long as it's got a processor core and I/O capable of that.
This recent thread is very much worth reviewing. Key link:
Read plenty on Amazon's setup. If you like free, you can do layer 4 (L4) load balancing with ipfilter and l4ip. :)
One of the YouTube people gave a talk about their infrastructure. Here's the video of the talk.
