From: Gilboa Davara (gilboada_at_nonexisting.hamakor.org.il)
Date: Sun 07 Aug 2005 - 17:24:59 IDT
Umm... it's a bit messy.
When you have two or more cores (be that dual core or dual socket),
you add a couple of problems, namely:
* Atomic operations. (LOCK# assertion / bt/bts/btr/etc., which prevent
two CPUs from accessing the same resource at the same time.)
* Cache coherency protocols. (CPU A updates a memory block that's cached
inside CPU B, so CPU B needs to be notified of the change.)
Both operations add significant latency and eat a lot of bus
bandwidth.
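
To make the atomic-operations point concrete, here is a minimal sketch
of a bts/btr-style bit lock written with C11 atomics. The helper names
try_lock_bit and unlock_bit are hypothetical, made up for illustration.
On x86, compilers typically turn these calls into LOCK-prefixed
instructions, which is exactly the traffic described above.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: atomically set bit `bit` in *word (bts-like).
 * Returns true only for the caller that flipped the bit from 0 to 1,
 * i.e. the one that now owns the resource guarded by that bit. */
static bool try_lock_bit(atomic_uint *word, unsigned bit)
{
    assert(bit < sizeof(unsigned) * CHAR_BIT);
    unsigned mask = 1u << bit;
    /* atomic_fetch_or returns the value held before the OR. */
    unsigned old = atomic_fetch_or(word, mask);
    return (old & mask) == 0;
}

/* Hypothetical helper: atomically clear bit `bit` in *word (btr-like). */
static void unlock_bit(atomic_uint *word, unsigned bit)
{
    assert(bit < sizeof(unsigned) * CHAR_BIT);
    atomic_fetch_and(word, ~(1u << bit));
}
```

A second try_lock_bit() on the same bit fails until unlock_bit() is
called. Every such call has to gain exclusive ownership of the cache
line holding the word, so the more CPUs hammer the same word, the more
coherency traffic you pay for.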
AMD's dual core implementation (unlike the Intel one, BTW) does this
using a special fast SRQ (System Request Queue) instead of using the
normal 800MHz/1GHz HyperTransport bus. This means that the inter-core
communication between the two cores eats fewer resources on an Opteron
275.
However, a dual Opteron has the following advantages:
* Twice the memory bandwidth. A dual Opteron has two DIMM banks (which,
when used with a NUMA-aware OS, gives you twice the memory bandwidth).
* Higher clock speed. Each x50 core runs at 2.4GHz while each x75 core
runs at 2.2GHz.
In short, it'll be fairly easy to create a test case where a dual
Opteron 250 outperforms a single dual core 275.
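
A back-of-the-envelope sketch of the two advantages above. The clock
speeds are the ones quoted in this mail. The per-socket memory figure
is an assumption that is not taken from this thread: dual-channel
DDR400 on every socket, i.e. 2 channels * 8 bytes * 400 MT/s. The
struct and function names are illustrative only.

```c
#include <assert.h>
#include <stdio.h>

/* One machine configuration; names and fields are illustrative. */
struct config {
    const char *name;
    unsigned sockets;
    unsigned cores_per_socket;
    unsigned core_mhz;
};

/* ASSUMPTION (not from the thread): every socket drives its own
 * dual-channel DDR400 memory: 2 channels * 8 bytes * 400 MT/s. */
enum { ASSUMED_MB_PER_SEC_PER_SOCKET = 2 * 8 * 400 };

/* Theoretical peak memory bandwidth; only reachable when a NUMA-aware
 * OS keeps each CPU working out of its own local memory. */
static unsigned peak_mem_mb_per_sec(const struct config *c)
{
    assert(c->sockets > 0);
    return c->sockets * ASSUMED_MB_PER_SEC_PER_SOCKET;
}

/* Sum of all core clocks; a crude proxy for compute-bound throughput. */
static unsigned aggregate_mhz(const struct config *c)
{
    return c->sockets * c->cores_per_socket * c->core_mhz;
}

/* Print one summary line for a configuration. */
static void print_config(const struct config *c)
{
    printf("%-20s %5u MHz total, %6u MB/s peak memory\n",
           c->name, aggregate_mhz(c), peak_mem_mb_per_sec(c));
}
```

Fed with {"dual Opteron 250", 2, 1, 2400} and {"single Opteron 275",
1, 2, 2200}, this gives 4800 vs. 4400 aggregate MHz and, under the
DDR400 assumption, 12800 vs. 6400 MB/s of theoretical peak memory
bandwidth. These are paper numbers, not measurements, but they show
why a memory-bound test case is the easy way to make the dual 250 win.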
BTW,
If you need a single CPU, dual core machine, you can save a lot of money
by buying a dual core Opteron 175 CPU (instead of the dual-CPU-capable
275 CPU); plus, single socket 940 boards usually cost half the price
of dual socket ones.
Better yet, if you don't need more than 2GB of memory (and so have no
need for registered memory), you can save even more by buying a dual
core Athlon 64 X2 4800+ (Socket 939) with a much cheaper socket 939
board and non-registered memory. (Athlon 64 X2 CPUs start at ~2300 NIS;
a 939 board costs less than 1000 NIS and non-registered memory is pretty
cheap these days.)
Gilboa
On Sun, 2005-08-07 at 16:50 +0300, Michael Ben-Nes wrote:
> Read somewhere today (wish I remembered where) that if the budget is
> limited and the options are:
>
>
> Dual Opteron 250 CPU
>
> OR
>
> One CPU 275 ( dual core )
>
>
> You should go for the single 275.
>
>
> One of the reasons was that the balancing between two real CPUs slows
> down performance.
>
>
> I'm wondering if that's true.
>
>
>
> Gilboa Davara wrote:
>
> > Shachar,
> >
> > There's no single answer to your question; in order to give you a
> > better answer I'll need some further information about your software.
> > Here are a couple of points that you might find interesting: (I mostly
> > do kernel-level network streaming/filtering work, so YMMV)
> >
> > * The AMD Opteron *is* the King of the Hill. I found "my" HP 385
> > (Opteron 248/250) and older IBM e326 (Opteron 246/248) able to
> > outperform a similarly configured (and priced) Xeon 2.8/3.4/3.6 (DL
> > 380, IBM e345) hands down. Highly memory- and I/O-intensive
> > applications like my own (which spends days btree searching and
> > memXXX-ing itself to death) seem to *greatly* favor the Opteron's
> > on-die memory controller (compared to the Xeon's traditional
> > north-bridge design). I'm still looking for ways to use the Opteron's
> > NUMA support; I *assume* that xxx_alloc_node will further improve
> > performance.
> >
> > * The dual core option is a true winner. Even the relatively cheap
> > (?!?!) Opteron 265 machine can run circles around a quad Xeon MP
> > machine. (Shared-bus designs never really favored > 2 CPU
> > configurations.) At less than $1000 per 265 CPU, building a dual-socket,
> > dual core workstation / server is pretty inexpensive. (I plan on
> > upgrading my private dual Opteron workstation to dual core once I find
> > someone that's willing to buy my left kidney...)
> >
> > * GCC's x86-64 (AMD64) optimizations favor the Opteron greatly. Only
> > when we optimized our code with -march=nocona did we manage to level
> > the playing field a *bit*. Somehow Intel seems to have skimped a little
> > when they duplicated AMD64 (s/EM64T/AMD64/g).
> > As far as I remember the Debian AMD64 port is using -march=nocona to
> > help the Xeon save face. (Same goes for my FC4/x86-64 machines.)
> >
> > * The Xeon might close the gap if you have highly hyper-threadable code
> > (little or no I/O [including memory I/O] with a lot of integer
> > calculations). In such a (remote?) case, you might actually see a
> > 10-15% gain per socket, maybe even slightly outperforming the Opteron.
> > However, if you plan on using two or more sockets (dual and up), a
> > shared 400/533MHz bus doesn't play nice with Hyper-Threading enabled.
> > In general I'd steer clear of Hyper-Threading on dual-or-above machines.
> >
> > * Might sound weird... but while working on my previous project we saw
> > instances where an older 2.8GHz 533MHz (Prestonia?) Xeon was able to
> > outperform the 3.0GHz 800MHz Nocona Xeons. Go figure.
> >
> > * The Itanium (1.4GHz, Madison core?) has lousy integer performance
> > and memory performance. Don't touch it. (Or you'll burn... literally...)
> >
> > In general I find the Opteron to be the superior platform. But again,
> > we conducted our tests with our software, so YMMV (greatly).
> >
> > Hope it helps,
> > Gilboa
> >
> > On Sun, 2005-08-07 at 14:52 +0300, Shachar Shemesh wrote:
> >
> >>Hi all,
> >>
> >>I'm looking into buying a computation server for a client. They are
> >>looking for the platform that will give them optimal INTEGER
> >>performance. I'm choosing between the 64-bit PowerPC, Itanium and the
> >>EM64T/AMD64 technologies. I am also interested in more specific
> >>knowledge ("Xeon is better than Athlon" etc.).
> >>
> >>Thoughts? Ideas?
> >>
> >>Any solution picked will be running Debian Linux (Sarge), and the
> >>program will likely be compiled with gcc (whatever version will work best).
> >>
> >>Thanks,
> >>
> >> Shachar
> >>
> >>
> >