Re: Where does the time go?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Where does the time go?
Date: 2006-03-25 17:55:26
Message-ID: 20060325175526.GE1695@svana.org
Lists: pgsql-hackers pgsql-patches

On Sat, Mar 25, 2006 at 05:38:26PM +0000, Simon Riggs wrote:
> On Sat, 2006-03-25 at 16:24 +0100, Martijn van Oosterhout wrote:
>
> > I agree. However, if it's the overhead of calling gettimeofday() that
> > slows everything down, perhaps we should tackle that end. For example,
> > have a sampling mode that only times say 5% of the executed nodes.
> >
> > EXPLAIN ANALYZE SAMPLE blah;
>
> I like this idea. Why not do this all the time? I'd say we don't need
> the SAMPLE clause at all, just do this for all EXPLAIN ANALYZEs.

I was wondering about that. But then you may run into weird results if
a subselect takes a long time for just a few values. But maybe it should
be the default, and have a FULL mode to say you want to measure
everything.
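
To illustrate the worry, here is a rough Python sketch (not backend
code). The function name estimate_total and the made-up durations are
assumptions for illustration only: it times a random fraction of the
iterations and scales the result up to a total.

```python
import random

def estimate_total(durations, rate, rng):
    """Time only a random fraction of iterations, then scale up.

    durations: per-iteration costs (hypothetical numbers)
    rate:      probability that any one iteration is timed
    rng:       a random.Random instance, so runs are repeatable
    """
    sampled = [d for d in durations if rng.random() < rate]
    if not sampled:
        return 0.0
    # Mean of the timed iterations, extrapolated to all iterations.
    return sum(sampled) / len(sampled) * len(durations)

if __name__ == "__main__":
    # 997 cheap iterations and 3 very expensive ones (a slow subselect
    # for just a few values).
    durations = [1.0] * 997 + [1000.0] * 3
    print("true total:", sum(durations))
    for seed in range(5):
        est = estimate_total(durations, 0.05, random.Random(seed))
        print("seed", seed, "estimate", round(est, 1))
```

With a 5% sample, most runs will probably miss all three slow
iterations and underestimate badly, while a run that happens to catch
one will overestimate. That is the case a FULL mode would cover.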

> Something even simpler? First 40 plus 5% random sample after that? I'd
> prefer a random sample so we have the highest level of trust in the
> numbers produced. Otherwise we might accidentally introduce bias from
> systematic effects such as nested loops queries speeding up towards the
> end of their run. (I know we would do that at the start, but we are
> stuck because we don't know the population size ahead of time and we
> know we need a reasonable number of data points).

Well, I was wondering if a fixed percentage was appropriate. 5% of 10
million is still a lot for possibly not a lot of benefit. The followup
email suggested a sampling that keeps happening less often as the
number of tuples increases, in a logarithmic way. But if we could add
some randomness, that'd be cool. The question is, what's the overhead of
calling random()?
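
One way the two ideas might be combined, as a rough Python sketch
rather than anything resembling executor code: always time the first
few tuples, then time tuple number k with probability scale/k, so the
expected number of timed tuples grows roughly with the logarithm of the
tuple count. The class name DecayingSampler, the scale/k rule and the
default values are all assumptions of mine, not something from the
followup email.

```python
import random

class DecayingSampler:
    """Decide per tuple whether to pay for a timing call."""

    def __init__(self, always_first=40, scale=40.0, rng=None):
        self.always_first = always_first  # tuples that are always timed
        self.scale = scale                # larger means more samples
        self.rng = rng if rng is not None else random.Random()
        self.seen = 0                     # tuples observed so far
        self.sampled = 0                  # tuples actually timed

    def should_time(self):
        self.seen += 1
        if self.seen <= self.always_first:
            take = True
        else:
            # Probability shrinks as 1/k, so the expected sample count
            # grows like scale * ln(k) instead of a fixed percentage.
            take = self.rng.random() < self.scale / self.seen
        if take:
            self.sampled += 1
        return take

if __name__ == "__main__":
    sampler = DecayingSampler(rng=random.Random(1))
    for _ in range(10_000_000):
        sampler.should_time()
    print("tuples seen:", sampler.seen, "tuples timed:", sampler.sampled)
```

Because later tuples are timed with lower probability, each timed tuple
would have to be weighted by the inverse of its sampling probability
when extrapolating a total. Note this still calls the random number
generator once per tuple, so it only helps if that call is noticeably
cheaper than gettimeofday().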

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
