Re: Large (8M) cache vs. dual-core CPUs

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>, mark(at)mark(dot)mielke(dot)cc, Bill Moran <wmoran(at)collaborativefusion(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Large (8M) cache vs. dual-core CPUs
Date: 2006-04-26 22:37:31
Message-ID: 20060426223731.GE97354@pervasive.com
Lists: pgsql-performance

On Wed, Apr 26, 2006 at 06:16:46PM -0400, Bruce Momjian wrote:
> Jim C. Nasby wrote:
> > On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote:
> > > If you haven't actually run a heavy benchmark of postgresql on the two
> > > architectures, please don't make your decision based on other
> > > benchmarks. Since you've got both a D920 and an X2 3800, that'd be a
> > > great place to start. Mock up some benchmark with a couple dozen
> > > threads hitting the server at once and see if the Intel can keep up. It
> >
> > Or better yet, use dbt* or even pgbench so others can reproduce...
>
> For why Opterons are superior to Intel for PostgreSQL, see:
>
> http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=2
>
> Section "MESI-MESI-MOESI Banana-fana...". Specifically, this part about
> the Intel implementation:
>
> The processor with the Invalid data in its cache (CPU 0, let's say)
> might then wish to modify that chunk of data, but it could not do so
> while the only valid copy of the data is in the cache of the other
> processor (CPU 1). Instead, CPU 0 would have to wait until CPU 1 wrote
> the modified data back to main memory before proceeding, and that takes
> time, bus bandwidth, and memory bandwidth. This is the great drawback of
> MESI.
>
> AMD transfers the dirty cache line directly from cpu to cpu. I can
> imagine that helping our test-and-set shared memory usage quite a bit.

Wasn't the whole point of test-and-set that it's the recommended way to
do lightweight spinlocks according to AMD/Intel? You'd think they'd have
a way to make that performant on multiple CPUs (though if it's relying
on possibly modifying an underlying data page I can't really think of
how to do that without snaking through the cache...)
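
For reference, here is a minimal sketch of the kind of test-and-set spinlock
being discussed, written with C11 atomics and POSIX threads. It is an
illustration only, not PostgreSQL's actual spinlock code; every name in it
(tas_lock, tas_lock_acquire, tas_lock_release, run_counter_demo, MAX_WORKERS)
is hypothetical. The point to notice is that each test-and-set is a write, so
the CPU doing it needs exclusive ownership of the cache line holding the lock
word, which is where the cost of moving a dirty line between CPUs would show
up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names throughout; this is not PostgreSQL's s_lock code. */
typedef struct {
    atomic_flag locked;          /* set = held, clear = free */
} tas_lock;

static void tas_lock_acquire(tas_lock *l)
{
    /*
     * test-and-set atomically sets the flag and returns its previous
     * value. A previous value of "set" means another thread holds the
     * lock, so keep spinning. Every attempt is a write to the lock's
     * cache line, so the line has to be owned exclusively by this CPU.
     */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;                        /* busy-wait */
}

static void tas_lock_release(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

#define MAX_WORKERS 16           /* arbitrary cap for this sketch */

struct worker_arg {
    tas_lock *lock;
    long     *counter;
    long      iters;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;

    for (long i = 0; i < a->iters; i++) {
        tas_lock_acquire(a->lock);
        (*a->counter)++;         /* shared data protected by the lock */
        tas_lock_release(a->lock);
    }
    return NULL;
}

/* Runs nthreads workers, each incrementing a shared counter iters times. */
long run_counter_demo(int nthreads, long iters)
{
    tas_lock lock = { ATOMIC_FLAG_INIT };
    long counter = 0;
    pthread_t tids[MAX_WORKERS];
    struct worker_arg arg = { &lock, &counter, iters };

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void) rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    return counter;
}
```

If the lock is correct, run_counter_demo(n, k) returns n * k for any n up to
MAX_WORKERS. Timing it with the threads pinned to different cores on each
machine would be one rough way to compare how the two architectures handle a
heavily contended lock word, though a pure spin loop like this exaggerates the
contention compared with real workloads.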
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
