Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-patches(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Improve EXPLAIN ANALYZE overhead by sampling
Date: 2006-05-30 14:01:49
Message-ID: 200605301401.k4UE1nO08251@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches


Patch applied. Thanks.

---------------------------------------------------------------------------

Martijn van Oosterhout wrote:
-- Start of PGP signed section.
> This was a suggestion made back in March that would dramatically reduce
> the overhead of EXPLAIN ANALYZE on queries that loop continuously over
> the same nodes.
>
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg01114.php
>
> What it does is behave normally for the first 50 tuples of any node,
> but after that it starts sampling at ever-increasing intervals, with
> the intervals controlled by an exponential function. So for a node
> iterating over 1 million tuples it takes around 15,000 samples. The
> result is that EXPLAIN ANALYZE has a much reduced effect on the total
> execution time.
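
The sampling scheme described above could be sketched as follows. This is a hypothetical illustration, not the patch's actual code: the function name, the threshold constant, and the growth factor are made up here; the real patch derives its own exponential curve.

```c
#include <assert.h>
#include <stdbool.h>

#define SAMPLE_THRESHOLD 50     /* time every tuple up to this count */

/*
 * Decide whether to time this tuple.  The first SAMPLE_THRESHOLD
 * tuples of a node are always timed; after that, tuples are timed
 * only at exponentially growing intervals, with *next_sample
 * carrying the tuple count at which the next timing is due.
 */
static bool
should_sample(double tupcount, double *next_sample)
{
    if (tupcount < SAMPLE_THRESHOLD)
        return true;
    if (tupcount >= *next_sample)
    {
        /* illustrative growth factor; the patch uses its own curve */
        *next_sample = tupcount * 1.05 + 1.0;
        return true;
    }
    return false;
}
```

With any growth factor above 1, the number of timed tuples grows only logarithmically with the total tuple count, which is why the per-tuple timing overhead nearly vanishes for large scans.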
>
> Without EXPLAIN ANALYZE:
>
> postgres=# select count(*) from generate_series(1,1000000);
> count
> ---------
> 1000000
> (1 row)
>
> Time: 2303.599 ms
>
> EXPLAIN ANALYZE without patch:
>
> postgres=# explain analyze select count(*) from generate_series(1,1000000);
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=15.00..15.01 rows=1 width=0) (actual time=8022.070..8022.073 rows=1 loops=1)
> -> Function Scan on generate_series (cost=0.00..12.50 rows=1000 width=0) (actual time=1381.762..4873.369 rows=1000000 loops=1)
> Total runtime: 8042.472 ms
> (3 rows)
>
> Time: 8043.401 ms
>
> EXPLAIN ANALYZE with patch:
>
> postgres=# explain analyze select count(*) from generate_series(1,1000000);
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=15.00..15.01 rows=1 width=0) (actual time=2469.491..2469.494 rows=1 loops=1)
> -> Function Scan on generate_series (cost=0.00..12.50 rows=1000 width=0) (actual time=1405.002..2187.417 rows=1000000 loops=1)
> Total runtime: 2496.529 ms
> (3 rows)
>
> Time: 2497.488 ms
>
> As you can see, the overhead goes from 5.7 seconds to 0.2 seconds.
> Obviously this is an extreme case, but it will probably help in a lot
> of other cases people have been complaining about.
>
> - To get this close it needs an estimate of the sampling overhead.
> It obtains this with a little calibration loop that is run once per
> backend. If you don't do this, you end up assuming all tuples take
> the same time as the sampled tuples, which carry the overhead,
> resulting in nodes apparently taking longer than their parent nodes.
> Incidentally, I measured the overhead to be about 3.6us per tuple per
> node on my (admittedly slightly old) machine.
>
> Note that the resulting times still include the overhead actually
> incurred; I didn't filter it out. I want the times to reflect
> reality as closely as possible.
>
> - I also removed InstrStopNodeMulti and made InstrStopNode take a tuple
> count parameter instead. This is much clearer all round.
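
The interface simplification can be illustrated with a stand-in: the struct below is a hypothetical mock, not PostgreSQL's real Instrumentation, but it shows how a single stop function with a tuple-count parameter covers both the one-tuple path and the batched path formerly served by InstrStopNodeMulti.

```c
#include <assert.h>

/* Hypothetical stand-in for the instrumentation state of one node. */
typedef struct Instrumentation
{
    double  tuplecount;     /* tuples emitted so far in this cycle */
} Instrumentation;

/*
 * One stop function covers both cases: pass 1.0 for the ordinary
 * one-tuple-per-call path, or a larger count for nodes that emit
 * tuples in batches.
 */
static void
InstrStopNode(Instrumentation *instr, double nTuples)
{
    instr->tuplecount += nTuples;
}
```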
>
> - I also didn't make it optional. I'm unsure whether it should be
> optional, given that the cases where it will make a difference
> are very few.
>
> - The tuple counter for sampling restarts every loop. Thus a node that is
> called repeatedly but returns only one value each time will still
> measure every tuple, though its parent node won't. We'll need some
> field testing to see whether that remains a significant effect.
>
> - I don't let the user know anywhere how many samples it took. Is this
> something users should care about?
>
> Any comments?
> --
> Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> > From each according to his ability. To each according to his ability to litigate.

[ Attachment, skipping... ]
-- End of PGP section, PGP failed!

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
