AMD or Intel?
November 14, 2005 1:40 PM

To get the best performance per dollar, do I want an Intel Pentium D or an AMD Athlon 64 X2 for my new DB+Compute server?

I am contemplating building a machine for a project that involves numerical computations on large data sets. The code is written in Perl, the database is MySQL, and the whole thing runs on Linux. The current code isn't parallel-aware, but it would be trivial to divide the data set in half and run two (or more) compute processes, so I think a dual-core CPU will get me more cycles per dollar.
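A rough sketch of the "divide the data set and run one process per piece" idea, written in Python rather than the Perl the project actually uses. The function names and the `crunch` workload are hypothetical stand-ins; the real code would read its slice of rows from MySQL instead of summing squares.

```python
from concurrent.futures import ProcessPoolExecutor


def split_range(n_items, n_workers):
    """Split indices 0..n_items-1 into n_workers contiguous (start, stop) chunks."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional item each.
        stop = start + base + (1 if worker < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def crunch(chunk):
    """Hypothetical stand-in for the real numerical work on one slice."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def run_parallel(n_items, n_workers=2):
    """Run one process per chunk and combine the partial results."""
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(crunch, split_range(n_items, n_workers)))
```

Calling `run_parallel(1_000_000)` from a script (under an `if __name__ == "__main__":` guard) would keep both cores of a dual-core CPU busy, provided the slices really are independent of each other.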

Nearly all of the benchmark sites are geared towards gamers and Photoshop and video users. I don't care about any of these things, so the results don't help me much.

If I get an AMD, it would be the Athlon 4400+, which is the first X2 CPU that has dual 1MB cache. I can't tell if an Intel 840 EE is comparable to the 4400+, or an 820. Obviously, if the 840 EE is on par with the 4400+, then the AMD part is the best value. I'm tempted to get an Intel 820 or 830, but would prefer not to get locked into a series of much slower processors.

My code isn't 64-bit aware, nor do I need more than 4GB RAM, so Intel's allegedly poor implementation of AMD64 shouldn't be a problem, nor should AMD's superior instruction set be an advantage.
posted by b1tr0t to computers & internet (18 comments total)
Look closely at any parallelized benchmarks. (IIRC, some of the Photoshop plugins are multi-threaded and you can see this clearly in tests.) It's widely acknowledged that AMD is providing much better bang for buck these days. I also believe that the amd64 Linux ports are developing rapidly, and gcc is getting faster on this architecture. I would definitely buy an AMD-made amd64 box. (However, six or seven months ago I found that a 3GHz 2MB cache Xeon Dell was my best buy.)
posted by wzcx at 1:54 PM on November 14, 2005


I have built XPC Shuttles based on both processors, and the AMD pretty much wins hands down on almost everything. *SOME* graphics apps like Avid (video editor) had a few optimizations for the PIV, so in some operations it ran a little better than the Athlon box. I have read (but not experienced) that the 840EE is better with apps that spawn a zillion threads, but overall the Athlon kicks total butt, and consumes a lot less power to boot.
posted by cowmix at 1:56 PM on November 14, 2005


The benchmark waters seem fairly muddy, but one thing is for sure: the Pentium Ds suck down about one and a half times as many watts as comparable Athlon 64 X2s, meaning higher electricity costs and cooling requirements.
posted by zsazsa at 2:00 PM on November 14, 2005


Probably the Athlon. But considering this is a DB server and that your datasets are "large" (larger than memory?), you might want to spend some extra $$ on non-CPU things. Like proper 15k RPM SCSI hard drives. They make a huge difference to DB performance under load because of the reduced latency and because they tend to seek much faster than most IDE drives (3ms vs 8ms).
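To put rough numbers on the latency point: a back-of-the-envelope estimate of random-access time is the seek time plus half a platter rotation. The seek figures below are the 3ms and 8ms quoted above; the 7200 RPM figure for the IDE drive is an assumption, and real drives add controller and transfer overhead on top of this.

```python
def avg_access_ms(seek_ms, rpm):
    """Average random-access time: seek plus half a rotation."""
    rotational_ms = 0.5 * 60_000.0 / rpm  # half a revolution, in milliseconds
    return seek_ms + rotational_ms


def random_reads_per_second(seek_ms, rpm):
    """Upper bound on random reads per second for a single spindle."""
    return 1000.0 / avg_access_ms(seek_ms, rpm)


scsi_15k = avg_access_ms(seek_ms=3.0, rpm=15_000)  # seek time from the post
ide_7200 = avg_access_ms(seek_ms=8.0, rpm=7_200)   # 7200 RPM is an assumption

print(f"15k SCSI: {scsi_15k:.2f} ms/access, "
      f"{random_reads_per_second(3.0, 15_000):.0f} random reads/s")
print(f"7200 IDE: {ide_7200:.2f} ms/access, "
      f"{random_reads_per_second(8.0, 7_200):.0f} random reads/s")
```

Under these assumptions the 15k drive handles a bit more than twice as many random reads per second, which only matters if the workload is actually bound by random disk access.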
posted by polyglot at 3:28 PM on November 14, 2005


AMD is definitely the way to go for a DB server... I know I have read articles comparing benchmarks of AMD vs Intel servers, but I can't remember the exact articles... I would bet that they were on one of these sites... Tom's Hardware or Anandtech

I read most of the articles on both of these sites and find both of them to be pretty good...
posted by kashmir772 at 4:00 PM on November 14, 2005


To veer slightly off topic, if you're concerned about speed in numerical computing, you'd save a lot of time by profiling your code and rewriting the most time-consuming part or parts (which would, it is to be presumed, include the actual number crunching) in C or C++ and using SWIG or XS to glue it into your Perl. (Unless you're just using CPAN modules that already do that.)
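The "profile first, then rewrite the hot spot" workflow looks roughly like this. This is a Python sketch using the standard `cProfile` module, not the Perl tooling the project would actually need, and `inner_loop` and `load_and_crunch` are hypothetical placeholders for the real code.

```python
import cProfile
import io
import pstats


def inner_loop(n):
    """Hypothetical number-crunching hot spot."""
    total = 0.0
    for i in range(n):
        total += (i % 7) * 0.5
    return total


def load_and_crunch():
    """Hypothetical driver: cheap setup plus the expensive loop."""
    data_sizes = [50_000, 100_000]
    return [inner_loop(n) for n in data_sizes]


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


print(profile_report(load_and_crunch))
```

Whatever shows up at the top of a report like this is the candidate for a C or C++ rewrite; everything else can usually stay in the scripting language.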
posted by Zed_Lopez at 4:38 PM on November 14, 2005


The first version of my code had quadratic DB access complexity. That is, for a dataset of size N, I had to hit the DB disk N*N times. When I rewrote the software to have linear DB access (N accesses for a dataset of size N), execution time dropped from 11 hours to 3 hours. Of the three hours of run time, about 10 minutes is spent accessing the DB. This was on an Athlon 2600+ with 512 MB RAM and an IDE disk.
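For anyone following along, the difference between quadratic and linear DB access can be sketched like this. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for MySQL, the `samples` table is a hypothetical schema, and the pairwise computation is a placeholder for the real statistics.

```python
import sqlite3


def make_db(n):
    """Build an in-memory table of n rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, value REAL)")
    conn.executemany("INSERT INTO samples VALUES (?, ?)",
                     [(i, float(i)) for i in range(n)])
    return conn


def pairwise_quadratic(conn, n):
    """Hit the database once per (i, j) pair: n * n queries."""
    queries, total = 0, 0.0
    for i in range(n):
        for j in range(n):
            a, b = conn.execute(
                "SELECT a.value, b.value FROM samples a, samples b "
                "WHERE a.id = ? AND b.id = ?", (i, j)).fetchone()
            queries += 1
            total += abs(a - b)
    return total, queries


def pairwise_linear(conn, n):
    """Fetch each row once (n queries), then do the pairwise work in memory."""
    queries, values = 0, []
    for i in range(n):
        (value,) = conn.execute(
            "SELECT value FROM samples WHERE id = ?", (i,)).fetchone()
        queries += 1
        values.append(value)
    total = sum(abs(a - b) for a in values for b in values)
    return total, queries
```

Both functions return the same total, but for 20 rows the first issues 400 queries and the second issues 20. The computation itself is still quadratic in memory, which fits the observation that most of the remaining run time is CPU rather than DB access.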

Further optimization of the algorithm might be possible, but I actually want to try out more statistical tests on my data. So I want a machine fast enough to write some new code (several hours) then execute, write more code, execute, etc.

To keep costs down, I'll get either a Raptor (10k SATA) and a 300GB disk, or just get a single WD Caviar RE2. Since such a small percentage of the run time is disk access, the high cost of SCSI (and low capacity) won't buy me very much performance. I'll probably toss in a PATA drive for the OS.
posted by b1tr0t at 5:11 PM on November 14, 2005


It's a little weird to see this question here, as I've been looking into the exact same questions... But, on a custom database project, so no chance our target use is the same. At any rate, after wearing out my eyeballs on Web sites the last few days, and reading way too many vendor tech sheets, here's what I think I've learned:

If your workload is largely computational, and can be run advantageously in a 64 bit environment (lot of floating point, memory intensive, etc.) without much I/O, AMD and lots of memory are your friends. But if you are I/O bound, Intel's chipset muscle offers better ways of clobbering those gremlins. But I've lately concluded, after many hours of review and study of tech documents, there are good reasons why workstation vendors aren't packing up their dual and multi socket Xeon and Opteron tents just yet.

The 840EE is a lotta processor, and a lot of money. More than double the cost of the 4400+ x2, but with HT enabled, it looks and acts like 4 execution cores to multi-threaded apps, as the 4400+ x2 never can. If you can load up its caches, and keep its pipelines full, its cores hyper-threading, and not melt your system, the 840EE will do a whopping amount of work. You see that in benchmark suites that spawn a lot of threads and are parallel, like video encoding. As THG put it:
"Thus, when making a purchasing decision, the question to ask is whether or not multiple applications will be running simultaneously. If the answer is yes, then the Intel Pentium 840 EE is your first choice. Otherwise, the AMD Athlon 64 X2 4800+ will give you much better performance for single applications."


As for the power cost difference between the platforms brought up here by zsazsa, it isn't economically significant enough to be a decision factor.

But the application you describe seems much more a transactional workload than it is like the CPU test results you are finding on most enthusiast sites. If you are hammering a database all the time, efficient I/O has got to be your concern. So, the comparison of processors drops in importance to the overriding consideration of getting a balanced solution. In that regard, your question is more Pentium vs. Xeon and Athlon vs Opteron, than Intel vs AMD.

That's because I/O is still a workstation strength, and likely to remain so for a while. The reason is that there are just more options for southbridge components in workstation chipsets than there are in the desktop space, and particularly where high I/O is important, you have to understand that. In the desktop world, RAID 0 SATA seems sufficient for most people, but if you need more I/O than that, your choices in the 939 world fall to nothing after your arrays are larger than the 4 disks most 939 chipsets can deliver. There is just no good way into a 939 socket other than SATA. As soon as you need more than 4 disk spindles in a RAID 10, you are effectively out of the 939 market, and into the 940 workstation space. But the same is true, generally, for the Intel platform, where dual cores are nice for PC's, but the bread and butter of the Xeon crowd is using Intel's 72xx chipset silicon with multi-socket Xeons as front ends for big RAID clusters and SAN solutions.

In the workstation world, it is not only spindles, but lanes and XOR silicon that rule, since distributed parity schemes like RAID 5 and 50, and RAID 6 are important, as are the management and the bidirectional throughput of fast and wide legacy buses like PCI-X, as a recent THG article describes. PCI-E may be the coming thing, but as it is implemented today on desktop motherboards in SLI video applications, it is a high speed one way road to a display screen, not a general purpose data bus. Even in its top of the line SLI nForce4 chipset, which has 20 programmable lanes, nVidia acknowledges this, since the usual layout on most motherboards in the 939 socket family defaults to 16 + 4 organizations in single card mode, and 8 + 8 only in SLI, and those lanes are one way, from the processor to the graphics slot, with, usually, only a single control lane back. That's because the market on the desktop for SLI is video and rendering, not whacking datasets...:-)

By workstation and server standards, desktop I/O is pretty lightweight and "bursty". Even the folks doing video capture and editing have such highly asymmetric I/O requirements that simple RAID 0 arrays of a few disks keep 'em happy. So, neither nVidia in nForce nor Intel in ICH7R bothers to put an XOR engine in desktop silicon. If you set up RAID 5 arrays, you immediately get processor loading on desktop boards, as the CPU is doing all the parity work, to say nothing of the saturation of the FSB that follows shortly in a transactional situation.
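For reference, the "parity work" in question is a byte-wise XOR across the data blocks of a stripe. A minimal Python sketch follows, with made-up 4-byte blocks; a real RAID 5 implementation also rotates the parity block across disks and works on much larger stripes, which this ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_for(data_blocks):
    """Parity block for one stripe: the XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the one lost block by XOR-ing the survivors with the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


stripe = [b"ABCD", b"EFGH", b"IJKL"]   # hypothetical data blocks on three disks
parity = parity_for(stripe)            # stored on a fourth disk
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
print(recovered == stripe[1])
```

Every write to the array means recomputing that XOR, so without a dedicated XOR engine the host CPU does this for every stripe it touches.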

So, my advice is to look pretty carefully at your I/O requirements before getting involved too deeply in the CPU wars. It's probable that you'll be looking dual socket more than dual core very quickly. No point in getting all excited about the pretty new toys being dangled about, if you have to go to work...

On preview, I see that you've characterized your problem as a small computational issue. Go AMD Athlon, and go for clock speed not cache size. Your issues are very like those of the gamers.
posted by paulsc at 5:30 PM on November 14, 2005


"...as it is implemented today on desktop motherboards in SLI video applications, it is a high speed one way road to a display screen, not a general purpose data bus. Even in its top of the line SLI nForce4 chipset, which has 20 programmable lanes, nVidia acknowledges this, since the usual layout on most motherboards in the 939 socket family default to 16 + 4 organizations in single card mode, and 8 + 8 only in SLI, and those lanes are one way, from the processor to the graphics slot, with, usually, only a single control lane back."

Do you have a reference for this? I would like to read more about it. Everything I have read (as limited as that is) says PCI-E lanes are full duplex. I did notice that nVidia has released a 38 (AMD) / 40 (Intel) lane nForce4 chipset. Of course these may still be 'one way', I guess.
posted by Chuckles at 8:28 PM on November 14, 2005


It is interesting that no one has come out strongly in favor of Intel CPUs.
posted by b1tr0t at 8:34 PM on November 14, 2005


Also, b1tr0t, you are asking about best performance per dollar, but you don't give us any other context on how much you want to spend. You can get a used MPX2 or dual cpu i875 based system with 66MHz/64bit PCI for $500 or less - yes, very old, but great performance per dollar if you actually want to minimize cost.
posted by Chuckles at 8:39 PM on November 14, 2005


Everything I've seen suggests that PCI Express lanes are bidirectional. That motherboards offer slots ranging from one to sixteen lanes, including 4- and 8-lane slots, strongly suggests that other I/O devices will become more widely available. Until gigabit ethernet got cheap and came to the desktop, there weren't many desktop peripherals that demanded as much bandwidth as the graphics subsystem. Ordinary PCI can't handle a single GigE port, much less the quad port cards that custom routers want (go open up a Cisco PIX 525 - it's just a PC with some custom Cisco software on it). Large RAID and fiber channel cards are another good application for multi lane PCI Express cards.

If someone wanted to start, say, an iSCSI NAS company, they could probably do so with almost no capital. All the parts are available, only a little custom software is needed. Marketing would be the biggest expense. Five years ago, there would be a lot of custom hardware interfacing to design. Today everything is available off the shelf.
posted by b1tr0t at 10:56 PM on November 14, 2005


"Everything I've seen suggests that PCI Express lanes are bidirectional. ..." posted by b1tr0t at 10:56 PM PST on November 14 [!]"

That's not the way PCIe generally works, according to what I've read. Essentially, each "lane" in a "bidirectional" PCIe "bus" (which is probably better thought of as a "lane group" than a "bus" anyway) consists of a "pair of pairs" of conductors, with one pair in the lane acting as transmit, and the other acting as the receive (viewed from a perspective of a device at one end of a link). And PCIe is a point to point serial link, so part of the deal in implementing PCIe is "steering" lanes to a particular device, and setting up its communication protocols in layers above the physical link layer. This greatly simplifies the complexity of the physical layer, and lowers its cost, while giving an implementer a lot of control over how he chooses to do things. But there is nothing that says that implementations of PCIe must inherently be symmetrical for all lanes, and for implementations of functions like video, which is mainly one way, there is little point in a board manufacturer taking up the real estate for extra lanes that have little use. Thus, in the most common nVidia SLI video solution, you have a specialized "bus" which is made up of 16 transmit lanes, and 4 receive lanes. The transmit "lanes" can generally be either all sent to one slot for a single card solution, or split into 2 sets of 8 "lanes" for 2 slot SLI applications. In the single card case, the "bus" then defaults to an effective x16 width on the populated slot, while the empty one remains x1, waiting around to see if someone is ever going to stick a card in it, and letting the system know when that happens. In a 2 card SLI setup, the bus is "divided" into two groups of 8 "lanes" which are sufficient for the throughput needs of the video cards, and each slot has an active x1 "control" line back to the system, which is plenty for status and timing messages in this highly asymmetric application.

So, in the case of a single card x16 solution, you'd think that the empty slot would have at least some utility as an x2, x3 or x4 slot, but in fact, the nVidia firmware in many common desktop boards is pretty single minded about what these slots can be used for, and that is only "graphics." Inserting a video card in the x16 slot configures the other slot, either with a physical device like a jumper or a paddle, or in software, as an x1 slot only, whether you like it or not. And with good reason, since in most of these desktop implementations, there aren't underlying matching transmit control lines for each lane anyway, or firmware back at the system end to handle configuration and operation of these lines as a general purpose I/O bus.

I know this from some discussions I had last week with LSI support engineers, regarding problems I'd read about with using their LSI00008 PCIe U320 SCSI controller with Asus A8x boards. Turns out that the LSI board needs 8 PCIe lanes, and says so, but in a working single board video system, 16 of the 20 available lanes in the nVidia firmware are default routed only to the slot populated with the video card, so the LSI board only sees the remaining x1 watchdog lane, and can't work. Even if you force switch the motherboard to x8 + x8 SLI mode, no go, because, again, it's actually a highly asymmetric video "bus." So, I asked them if they knew of any motherboards where their controller was successfully being used, and they popped right back with "Sure, the Tyan K8WE." You'll note that Tyan makes a big deal out of their board having "Dual PCI Express x16 slots with FULL SPEED x16 lanes on each slot" and rightly so. They've gone to the trouble and expense of putting down all those symmetrical lanes missing on the desktop boards, and expect to expose anything that starts sending signals on them to the rest of the system via their PCIe bridge. Dandy, no? But, of course, this is a dual Opteron board, and such niceties are expected by the workstation crowd.

Back in the desktop space, where no good idea is left lying if it can be further tortured into unit volume profit, Gigabyte has introduced "frankenSLI" boards. And Asus is busy making SLI with "dual x16" capabilities. But don't get your hopes up for using these products for something other than video. Again, these are implementations of PCIe as video buses, not general purpose expansion slots.

Personally, I think the whole PCIe thing is going to create a huge amount of confusion in the minds of consumers. One of the beauties of the classic PCI bus we've all come to know and love and hate is that PCI devices "just work" because the classic PCI bus was implemented as a general purpose expansion bus, and we all used it without much regard for physical connector conformance or device contention. We as end users didn't have to know much, or exercise any restraint, because if we could shove a card in a PCI slot, it was bound to work, since the board designers were obligated to make it so, under the constraints of remaining compliant with the PCI spec. Motherboards with 5 slots had to power 5 cards successfully, and handle signaling at standard rates from 5 cards, etc. Easy peasy, for the end user, at least.

Not so with PCIe. PCIe, being a point to point physical layer, necessarily carries an implementation burden through to the device designer, and to the end user, with regards to lane steering and bandwidth use, that can't be so easily taken care of as it was in the old PCI PNP days. If a motherboard maker puts in a 3 slot x16 general purpose expansion "bus," (thinking reasonably that some users will want things like, say, an x8 second video card, a x4 super sound and media expansion card, and an x4 "smart home" controller in their PC) and the customer tries 3 cards with x16 needs each, two of them can't work, even if it looks like, physically, they could. If a motherboard maker puts in a physical x16, x8, x4 and x1 "expansion bus" implementation, to be sure he is reserving the right number of lanes for anything that will physically plug in, the manufacturer is potentially wasting a lot of cost and real estate for slots that may never be used, and the customer is going to have to be sure he has a slot (of the right width or greater) available for whatever device he wants to add. On top of all that, for those who are watching the desktop chipset wars, where more and more of what were once expansion functions are routinely included in chipset silicon, the expansion board market doesn't look rosy, anyway. So, you may see desktop boards with one or two x16 slots, each of which can take anything up to x16, but that's about it. Of course, without a lot of slots to fill, people will probably buy fewer expansion cards, and expect more in "standard" PC features.

So, much as I wish it weren't so, I see PCIe as putting an end to people easily customizing their machines, instead of it working to promote this, simply due to confusion on the part of end users. It would have been better if the whole thing were marketed as something like "Express Lane" from the outset, and if a general expansion bus implementation had been agreed, to build on the experience of millions of users around the world with Plug and Pray PCI. Maybe that can still happen, but at the moment, in the valley of the blind, the one eyed man is king...
posted by paulsc at 12:29 PM on November 15, 2005


I'm not sure that expansion beyond adding graphics cards is something that the average user wants.

Back in the day, I remember building PCs that consisted of:

1. Motherboard + CPU + RAM
2. Video card (VESA Local Bus, usually)
3. Super I/O card (ISA; IDE, Serial, Parallel, 16550 UARTs if you were lucky)
4. SoundBlaster 16 (ISA)
5. Modem (ISA)

Today, you can get tons of motherboards with everything but video integrated. The better boards even include pseudo RAID (hardware that claims to be RAID, but requires a significant software component to work).

I don't think the average user needs RAID or would know what to do with it, so the lack of bandwidth for big RAID cards doesn't seem to be a huge loss.

Many boards have gigabit ethernet integrated, sometimes even two ports, so that's another device that the average user won't need to add.

I was excited about PCI express because it looked like it might be a sneaky way to prototype some really cool embedded systems. It sounds like that is not the case now, but I could see NVIDIA coming out with chipsets that feature multiple 16x16 slots and marketing the hell out of them to upsell to eager buyers.

I don't know why a sound card or home automation card would need more than a single lane. USB's 12 megabits per second is plenty for a four port audio card, so I have a hard time believing that a PCI express lane is insufficient for the most exotic sound device you can come up with.
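The arithmetic behind that, as a quick Python sketch. It assumes "four port" means four CD-quality stereo streams (44.1 kHz, 16 bit), which is a guess, and it compares raw signaling rates only, ignoring protocol overhead on both USB and PCI Express.

```python
def pcm_mbit_per_s(sample_rate_hz, bits_per_sample, channels):
    """Raw bit rate of an uncompressed PCM audio stream, in Mbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1e6


USB_FULL_SPEED_MBIT = 12.0   # the figure quoted in the post
PCIE_X1_MBIT = 2000.0        # first-generation lane: 2.5 GT/s with 8b/10b coding, per direction

one_stream = pcm_mbit_per_s(44_100, 16, 2)   # CD-quality stereo (assumption)
four_streams = 4 * one_stream

print(f"one stream:   {one_stream:.4f} Mbit/s")
print(f"four streams: {four_streams:.4f} Mbit/s")
print(f"share of USB 12 Mbit/s: {four_streams / USB_FULL_SPEED_MBIT:.0%}")
print(f"share of one PCIe lane: {four_streams / PCIE_X1_MBIT:.2%}")
```

Even with higher sample rates or more channels there is a lot of headroom before a single lane becomes the bottleneck.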
posted by b1tr0t at 5:27 PM on November 15, 2005


It is worth remembering that just 1 PCI-E link is twice as fast as a 33/32 PCI bus, and full duplex. So, even the 16x down and 4x up of the nvidia chipset has some pretty serious bandwidth capability, at least equal to 133MHz/64bit PCI. Theoretically you should still be able to plug in a PCI video card and then use the graphics slot for any PCI-E card that works on 4 links or fewer. paulsc's information is intriguing though, I guess you don't know for sure if it will work until you try it - I don't know why that surprises me.
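A quick check of those numbers in Python, using theoretical peak rates only. The PCI Express figure assumes first-generation signaling (2.5 GT/s per lane with 8b/10b coding, so 250 MB/s per lane in each direction); real-world throughput will be lower on every bus listed.

```python
def parallel_bus_mb_s(clock_mhz, width_bits):
    """Peak rate of a parallel bus such as PCI or PCI-X, in MB/s."""
    return clock_mhz * width_bits / 8


def pcie1_mb_s(lanes):
    """Peak rate per direction of a first-generation PCIe link, in MB/s."""
    # 2500 Mbit/s raw per lane, 8 of every 10 bits carry data, 8 bits per byte
    return lanes * 2500 * 8 / 10 / 8


pci_33_32 = parallel_bus_mb_s(33.33, 32)     # classic PCI
pcix_133_64 = parallel_bus_mb_s(133.33, 64)  # PCI-X

print(f"PCI 33MHz/32bit:    {pci_33_32:.0f} MB/s, shared by all devices")
print(f"PCI-X 133MHz/64bit: {pcix_133_64:.0f} MB/s, shared by all devices")
for lanes in (1, 4, 16):
    print(f"PCIe x{lanes}: {pcie1_mb_s(lanes):.0f} MB/s in each direction")
```

So a single lane is a little under twice classic PCI in one direction, and an x4 link is in the same neighborhood as 133MHz/64bit PCI-X per direction, before counting that PCI Express can send and receive at the same time.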

For the record, the best place I know of for information about high performance computing on the cheap is the www.2cpu.com forum, a pretty impressive place.
posted by Chuckles at 6:13 PM on November 15, 2005


"...I don't think the average user needs raid or would know what to do with it, so the lack of bandwidth for big raid cards doesn't seem to be a huge loss. ..."

"...I don't know why a sound card or home automation card would need more than a single lane..." posted by b1tr0t at 5:27 PM PST on November 15

Unfortunately, the "average user" has never experienced an immersive, ubiquitous computer supported environment. The Enterprise Computers (both "characters" voiced by Majel Barrett, Gene Roddenberry's wife, and an occasional on screen player, too, in both series, FWIW) from Star Trek are the thing most people bring up when I get into conversations about this. Others toss in what they remember about HAL 9000 from 2001: A Space Odyssey. Most people seem to think it would be strange but maybe cool to control the lights in a room on voice commands, and some few thousands of people around the world have actually set this up in their homes, I guess. Few who understand anything about SQL or relational databases imagine most people constructing decent queries on the fly by speaking to the Unseen Machine, although the common experience of Google is that you don't have to know much to learn a lot, if you're willing to trust PageRank, and keep banging away at a topic. I understand all this, and agree that the technologists among us have failed to help us imagine anything better or more useful than these fairly campy bits from old space operas.

Too bad for all of us, because until we get speech recognition working really well, and other stuff like predictive behavior analysis in realtime AI all sorted out, we're going to have a hard time making it part of the needs hierarchy of the cool kids. It's a classic chicken and egg problem. Without the interfaces and I/O methods for automated environmental interaction being standardized and mass produced, it's a huge gamble to try to do systems development. Yet without working applications and software systems, there is no incentive for anyone to start working to make the hardware to implement such environments. And even though Googling the phrase "immersive computing" now gets a return set of over half a million hits, it's still pretty much a pipe dream.

But bringing all this back to this thread, I do think that high bandwidth I/O ports on personal computers are such a basic need, we can hardly notice that we don't really have good ones. USB has lived not because it was good, or really, for a long time, all that useful, but because sooner or later, everyone needs to get something into or out of a PC without breaking a sweat, and with all its warts, USB has been the prime "solution" to these low level, but broadly experienced needs.

I'd just hoped PCIe would open up a bigger world than wire tethered USB could ever hope to do. But, it doesn't look to me that it is headed in that direction, with any overarching method in the marketplace madness. YMMV, and my crystal ball has been badly cracked since about the time Richard Nixon was elected President, but PCIe looks like a missed opportunity in any world larger than XVGA dimensions...
posted by paulsc at 7:13 PM on November 15, 2005


I suppose if you want to build star-trek style devices, you might need a lot more functionality than is currently present in desktop busses. But you may need a lot more CPU, memory, and disk as well.

I eagerly anticipate quantum computing and the possibility that NP problems may become P. Until then, I'm happy to continue punching the clock and building relatively boring computing devices that enough people seem to find very useful.

If I was to put my futurist hat on, I would look to molecular-scale computers and machines. They could possibly solve all sorts of energy problems, ranging from a safe and cheap hydrogen economy to safe and cheap fission and fusion. That will speed up entropy and the race to Three Degrees Above Zero, but it will be a fun ride!
posted by b1tr0t at 8:27 PM on November 15, 2005


WebHostingTalk thread on Intel vs. AMD. Includes notes on dual core Opterons which are unexpectedly affordable.
posted by b1tr0t at 1:50 PM on November 20, 2005

