Re: That EXPLAIN ANALYZE patch still needs work

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Larry Rosenman <ler(at)lerctr(dot)org>, "'Alvaro Herrera'" <alvherre(at)commandprompt(dot)com>, "'Simon Riggs'" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: That EXPLAIN ANALYZE patch still needs work
Date: 2006-06-09 15:49:03
Message-ID: 12446.1149868143@sss.pgh.pa.us
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Fri, Jun 09, 2006 at 10:00:20AM -0400, Tom Lane wrote:
>> I had thought we were applying an order-of-ten-percent correction by
>> subtracting SampleOverhead, not an order-of-10x correction :-(

> Eh? The whole point is to call gettimeofday() much less often. If you
> call it 1000th as often, then the correction is only on the order of
> one hundredth of the normal query time...

No, because the correction calculation is
totaltime += (average time per sampled execution - SampleOverhead) * (number of unsampled executions)

If SampleOverhead is 90% of the average time per sampled execution,
then you are multiplying something with a large component of
cancellation error by a possibly-large number.
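
To put a rough number on that cancellation: if SampleOverhead is a fraction f of the average sampled time, a relative error e in the overhead estimate turns into a relative error of about e * f / (1 - f) in the difference. A minimal Python sketch, where the function name and parameters are illustrative and not taken from the patch:

```python
def amplified_error(overhead_fraction, overhead_rel_error):
    """Relative error in (avg - overhead) caused by a relative error in overhead.

    overhead_fraction  -- overhead / avg, assumed to lie in [0, 1)
    overhead_rel_error -- relative error in the overhead estimate (0.10 = 10%)
    """
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    # The absolute error is overhead_rel_error * overhead; dividing by the
    # true difference (avg - overhead) gives the relative error after
    # subtraction.
    return overhead_rel_error * overhead_fraction / (1.0 - overhead_fraction)


if __name__ == "__main__":
    # Overhead is 90% of the sampled time and is misestimated by 10%.
    print(f"relative error in the difference: {amplified_error(0.9, 0.10):.2f}")
```

With a 90% overhead fraction, a 10% misestimate of the overhead shifts the per-execution difference by roughly 90%, and that is before it gets multiplied by the number of unsampled executions.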

As an example, using the numbers I posted here for my old PC:
http://archives.postgresql.org/pgsql-hackers/2006-06/msg00407.php
the actual runtime is clearly about 1 usec per tuple but enabling
timing adds 8.5 usec per tuple. If we suppose we sampled
10000 out of 1 million rows, then we'd have

raw totaltime = 10000 * 9.5 usec = 95 msec
avg time/execution = 9.5 usec
SampleOverhead = 8.5 usec
number of unsampled executions = 990000
correction = (9.5 - 8.5) usec * 990000 = 990 msec

which means that a 10% error in estimating SampleOverhead would
contribute as much to the final estimate as the actual measurement did.
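
The arithmetic can be rechecked with a short Python sketch. It assumes the correction formula exactly as written earlier in this message; the function and variable names are made up for illustration and do not come from the patch. It also reruns the calculation with SampleOverhead underestimated by 10% to show how far the correction moves.

```python
def sampled_estimate(sampled, total_rows, avg_sampled_usec, overhead_usec):
    """Return (raw_total_usec, correction_usec) for a sampled timing run.

    correction = (avg time per sampled execution - overhead)
                 * (number of unsampled executions)
    """
    raw_total = sampled * avg_sampled_usec
    unsampled = total_rows - sampled
    correction = (avg_sampled_usec - overhead_usec) * unsampled
    return raw_total, correction


if __name__ == "__main__":
    # Numbers from the example: 10000 of 1 million rows sampled,
    # 9.5 usec per sampled execution, 8.5 usec of that is overhead.
    raw, corr = sampled_estimate(10_000, 1_000_000, 9.5, 8.5)
    # Same run, but with the overhead estimate 10% too low.
    _, corr_off = sampled_estimate(10_000, 1_000_000, 9.5, 8.5 * 0.9)

    print(f"raw totaltime        = {raw / 1000:.1f} msec")
    print(f"correction           = {corr / 1000:.1f} msec")
    print(f"correction, 10% off  = {corr_off / 1000:.1f} msec")
    print(f"shift from 10% error = {(corr_off - corr) / 1000:.1f} msec")
```

The shift comes out at roughly 840 msec, which is the same order as the 990 msec correction itself and far larger than the 95 msec that was actually measured.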

regards, tom lane
