Re: Large (8M) cache vs. dual-core CPUs

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>, mark(at)mark(dot)mielke(dot)cc,Bill Moran <wmoran(at)collaborativefusion(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: Large (8M) cache vs. dual-core CPUs
Date: 2006-04-26 22:37:31
Message-ID: 20060426223731.GE97354@pervasive.com
Lists: pgsql-performance
On Wed, Apr 26, 2006 at 06:16:46PM -0400, Bruce Momjian wrote:
> Jim C. Nasby wrote:
> > On Wed, Apr 26, 2006 at 10:27:18AM -0500, Scott Marlowe wrote:
> > > If you haven't actually run a heavy benchmark of postgresql on the two
> > > architectures, please don't make your decision based on other
> > > benchmarks.  Since you've got both a D920 and an X2 3800, that'd be a
> > > great place to start.  Mock up some benchmark with a couple dozen
> > > threads hitting the server at once and see if the Intel can keep up.  It
> > 
> > Or better yet, use dbt* or even pgbench so others can reproduce...
> 
> For why Opterons are superior to Intel for PostgreSQL, see:
> 
> 	http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=2
> 
> Section "MESI-MESI-MOESI Banana-fana...".  Specifically, this part about
> the Intel implementation:
> 
> 	The processor with the Invalid data in its cache (CPU 0, let's say)
> 	might then wish to modify that chunk of data, but it could not do so
> 	while the only valid copy of the data is in the cache of the other
> 	processor (CPU 1). Instead, CPU 0 would have to wait until CPU 1 wrote
> 	the modified data back to main memory before proceeding, and that takes
> 	time, bus bandwidth, and memory bandwidth. This is the great drawback of
> 	MESI.
> 
> AMD transfers the dirty cache line directly from CPU to CPU.  I can
> imagine that helping our test-and-set shared memory usage quite a bit.

Wasn't the whole point of test-and-set that it's the recommended way to
do lightweight spinlocks according to AMD/Intel? You'd think they'd have
a way to make that performant on multiple CPUs (though if it's relying
on possibly modifying an underlying data page I can't really think of
how to do that without snaking through the cache...)
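
For reference, here is a minimal sketch of the kind of test-and-set
spinlock being discussed, written with C11 atomics. The names (tas_lock,
tas_try_acquire, tas_acquire, tas_release) are hypothetical and this is
not PostgreSQL's actual spinlock code; it only illustrates why every
acquisition attempt is a write to the lock's cache line, which is where
the cost of moving a dirty line between CPUs would show up.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: one flag, clear = free, set = held. */
typedef struct {
    atomic_flag held;
} tas_lock;

/*
 * One test-and-set attempt. The flag is set atomically and its previous
 * value is returned, so the caller owns the lock only if it was clear.
 * The operation writes the cache line even when the attempt fails.
 */
static inline bool tas_try_acquire(tas_lock *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Spin until an attempt succeeds. */
static inline void tas_acquire(tas_lock *lock)
{
    while (!tas_try_acquire(lock)) {
        /* busy-wait; a real implementation would back off or yield */
    }
}

/* Clear the flag so another CPU's test-and-set can succeed. */
static inline void tas_release(tas_lock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

A lock would be declared as `tas_lock l = { ATOMIC_FLAG_INIT };` and a
critical section wrapped in tas_acquire(&l) / tas_release(&l). With
several CPUs spinning, each failed attempt still pulls the line into the
spinning CPU's cache in modified state, so how the hardware hands a
dirty line from one CPU to another matters for this pattern.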
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby(at)pervasive(dot)com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
