Re: Can the V7.3 EXPLAIN ANALYZE be trusted?

From: Steven Rosenstein <srosenst(at)us(dot)ibm(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Can the V7.3 EXPLAIN ANALYZE be trusted?
Date: 2005-02-07 02:43:09
Message-ID: OF5755F2EF.96F7B2BD-ON85256FA1.000E04A3-85256FA1.000EEFF9@us.ibm.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

You're probably right about my being overly optimistic about the load
imposed by EXPLAIN ANALYZE. It was just that in my previous experience
with it, I'd never seen such a large runtime discrepancy before. I even
allowed for a "caching effect" by making sure the server was all but
quiescent, and then running the three queries as quickly after one another
as I could.

The server itself is an IBM x345 with dual Xeon 3ghz CPU's (hyperthreading
turned off) and 2.5gb of RAM. O/S is RHEL3 Update 4. Disks are a
ServeRAID of some flavor, I'm not sure what.

Thanks for the heads-up about the performance of IN in 7.3. We're looking
to migrate to 8.0 or 8.0.1 when they become GA, but some of our databases
are in excess of 200gb-300gb, and we need to make sure we have a good
migration plan in place (space to store the dump out of the 7.3 db) before
we start.
___________________________________________________________________________________

Steven Rosenstein
IT Architect/Developer | IBM Virtual Server Administration
Voice/FAX: 845-689-2064 | Cell: 646-345-6978 | Tieline: 930-6001
Text Messaging: 6463456978 @ mobile.mycingular.com
Email: srosenst @ us.ibm.com

"Learn from the mistakes of others because you can't live long enough to
make them all yourself." -- Eleanor Roosevelt


Tom Lane
<tgl(at)sss(dot)pgh(dot)pa(dot)u
s> To
Steven Rosenstein/New
02/06/2005 05:46 York/IBM(at)IBMUS
PM cc
pgsql-performance(at)postgresql(dot)org
Subject
Re: [PERFORM] Can the V7.3 EXPLAIN
ANALYZE be trusted?





> From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org]On Behalf Of Steven
Rosenstein
> >> I don't think EXPLAIN ANALYZE puts that much overhead on a query.

I think you're being overly optimistic. The explain shows that the
Materialize subnode is being entered upwards of 32 million times:

-> Materialize (cost=505.06..511.38 rows=632 width=4) (actual
time=0.00..0.02 rows=43 loops=752066)

43 * 752066 = 32338838. The instrumentation overhead is basically two
gettimeofday() kernel calls per node entry. Doing the math shows that
your machine is able to do gettimeofday() in about half a microsecond,
which isn't stellar but it's not all that slow for a kernel call.
(What's the platform here, anyway?) Nonetheless it's a couple of times
larger than the actual time needed to pull a row from a materialized
array ...

The real answer to your question is "IN (subselect) sucks before PG 7.4;
get a newer release".

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Antony Paul 2005-02-07 09:07:15 Index not used with or condition
Previous Message Jan Wieck 2005-02-06 23:06:03 Re: PostgreSQL clustering VS MySQL clustering