Re: Xeon twice the performance of opteron

From: "mark" <dvlhntr(at)gmail(dot)com>
To: "'Jeff'" <threshar(at)torgo(dot)978(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Cc: "'Brian Ristuccia'" <brian(at)ristuccia(dot)com>
Subject: Re: Xeon twice the performance of opteron
Date: 2011-03-18 01:24:09
Message-ID: 005e01cbe50b$2f6fad00$8e4f0700$@com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> -----Original Message-----
> From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-
> owner(at)postgresql(dot)org] On Behalf Of Jeff
> Sent: Thursday, March 17, 2011 9:14 AM
> To: pgsql-performance(at)postgresql(dot)org
> Cc: Brian Ristuccia
> Subject: [PERFORM] Xeon twice the performance of opteron
>
> hey folks,
>
> Running into some odd performance issues between a few of our db
> boxes. While trying to speed up a query I ran it on another box and
> it was twice as fast. The plans are identical and various portions of
> the query run in the same amount of time - it all boils down to most
> of the time being spent in a join filter. The plan is as good as it
> is going to get but the thing that is concerning me, which hopefully
> some folks here may have some insight on, is the very large difference
> in runtime.
>
> three boxes:
> A: Intel(R) Xeon(R) CPU E5345 @ 2.33GHz (Runs query
> fastest)
> 4MB cache
> B: Quad-Core AMD Opteron(tm) Processor 2352 (2.1GHZ) (Main
> production
> box, currently, middle speed)
> 512k cache
> C: Quad-Core AMD Opteron(tm) Processor 2378 (2.4GHZ)
> 512k cache
>
> A & B are running PG 8.4.2 (yes, I know it desperately need to be
> upgraded). C was also on 8.4.2 and since it was not in production I
> upgraded it to 8.4.7 and got the same performance as 8.4.2. Dataset
> on A & B is the same C is mostly the same, but is missing a couple
> weeks of data (but since this query runs over 3 years of data, it is
> negligable - plus C runs the slowest!)
>
> All three running FC10 with kernel Linux db06
> 2.6.27.19-170.2.35.fc10.x86_64 #1 SMP Mon Feb 23 13:00:23 EST 2009
> x86_64 x86_64 x86_64 GNU/Linux
>
> Load is very low on each box. The query is running from shared_buffers
> - no real IO is occuring.
>
> The average timing for the query in question is 90ms on A, 180ms on B
> and 190ms on C.
>
> Now here's where some odd stuff starts piling up: explain analyze
> overhead on said queries:
> 20ms on A, 50ms on B and 85ms on C(!!)
>
> We had one thought about potential NUMA issues, but doing a series
> (100) of connect, query, disconnect and looking at the timings reveals
> them all to be solid... but even still we wouldn't expect it to be
> that awful. The smaller cache of the opterons is also a valid
> argument.
>
> I know we're running an old kernel, I'm tempted to upgrade to see what
> will happen, but at the same time I'm afraid it'll upgrade to a kernel
> with a broken [insert major subsystem here] which has happened before.
>
> Anybody have some insight into this or run into this before?
>
> btw, little more background on the query:
>
> -> Nested Loop (cost=5.87..2763.69 rows=9943 width=0) (actual
> time=0.571..2
> 74.750 rows=766 loops=1)
> Join Filter: (ce.eventdate >= (md.date - '6 days'::interval))
> -> Nested Loop (cost=5.87..1717.98 rows=27 width=8)
> (actual time=0.53
> 3..8.301 rows=159 loops=1)
> [stuff removed here]
> -> Index Scan using xxxxxxx_date_idx on xxxxxx md
> (cost=0.00..19.50 rows=1099 width=8) (actual time=0.023..0.729
> rows=951 loops=15
> 9)
> Index Cond: (ce.eventdate <= md.date)
>
>
> On all three boxes that inner nestloop completes in about the same
> amount of time - it is that join filter that is causing the pain and
> agony. (If you are noticing the timing differences, that is because
> the numbers above are the actual numbers, not explain analyze). The
> query is pulling up a rolling window of events that occured on a
> specific date. This query pulls up al the data for a period of time.
> ce.eventdate is indexed, and is used in the outer nestloop. Thinking
> more about what is going on cache thrashing is certainly a possibility.
>
> the amazing explain analyze overhead is also very curious - we all
> know it adds overhead, but 85ms? Yow.
>
> --
> Jeff Trout <jeff(at)jefftrout(dot)com>
> http://www.stuarthamm.net/
> http://www.dellsmartexitin.com/

I am sure you might have already checked for this, but just incase...
Did you verify that no power savings stuff is turned on in the BIOS or at
the kernel ?

I have to set ours to something HP calls static high performance or
something like that if I want boxes that are normally pretty idle to execute
in a predictable fashion for sub second queries.

I assume you checked with a steam benchmark results on the AMD machines to
make sure they are getting in the ballpark of where they are supposed to ?

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2011-03-18 01:39:11 Re: Xeon twice the performance of opteron
Previous Message Andy Colson 2011-03-18 01:08:05 Re: Fastest pq_restore?