Re: Some belated patch review for "Buffers" explain analyze patch

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some belated patch review for "Buffers" explain analyze patch
Date: 2010-02-10 00:18:42
Message-ID: 603c8f071002091618y21a4a37o7374a63167c320f@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 9, 2010 at 6:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> A more important point is that it would be a nontrivial change, both as
>>> to code and documentation, and it's too late for such in 9.0.  So what
>>> we ought to be confining the discussion to right now is what 9.0 should
>>> print here.
>
>> It's exactly as nontrivial as the proposed change in the other direction.
>
> Not in the least.  Fixing EXPLAIN to consistently print totals would
> involve changes in (at least) the treatment of estimated costs, and very
> possibly some changes in the Instrumentation support as well.

As far as I am aware there is one place (in ExplainNode) where all the
division happens for the regular estimates, and one place in that same
function that would need to be changed for EXPLAIN BUFFERS. On a
quick look, I see no reason why the Instrumentation support would need
any modification at all.

> I notice
> you blithely disregarded the documentation point, too.

Very blithely. The current behavior of dividing the estimate by the
row count and rounding off in a way that makes it impossible to
reconstruct the raw numbers is equally undocumented. It seems to me
that the documentation will require some updating no matter what we
decide to do.
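
To make the "impossible to reconstruct" point concrete, here is a rough sketch. It assumes the displayed figure is a total divided by some count and rounded to a whole number; the helper name and the round-half-up rule are illustrative assumptions, not a description of what ExplainNode actually does.

```python
def per_loop(total, count):
    """Hypothetical helper: average a total over `count` and round half up."""
    return int(total / count + 0.5)


count = 5326                      # divisor (e.g. number of iterations)
shown = per_loop(31529, count)    # the single figure the reader would see

# Collect every raw total that would be displayed as that same figure.
same = [t for t in range(10 * count) if per_loop(t, count) == shown]

print(shown, same[0], same[-1], len(same))
```

Under those assumptions, thousands of distinct raw totals collapse onto one printed value, so the reader cannot get back from the printed value to the total.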

It seems to me that the entire topic of this thread is taking some
numbers that are simple and useful and trying as hard as possible to
masticate them in a way that will make them misleading and difficult
to use. As I understand it, the proposal on the table is that if we
have a node that over 5,326 iterations hits 31,529 shared buffers and
reads 2,135 shared buffers, then instead of printing:

Buffers: shared hit=31529 read=2135

...we're instead going to print:

Buffers: shared hit=47kB read=3kB

Can someone explain to me why we think that's an improvement?
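
For anyone checking the arithmetic, the second line can be reproduced with the sketch below. It assumes the default 8kB block size and rounding to the nearest kB; neither is spelled out above, and the function names are made up for illustration.

```python
BLOCK_SIZE_KB = 8  # assumption: default PostgreSQL block size


def per_loop_kb(buffers, loops, block_kb=BLOCK_SIZE_KB):
    """Convert a total buffer count into kB per iteration."""
    return buffers * block_kb / loops


def format_buffers(hit, read, loops):
    """Hypothetical formatter for the proposed per-iteration kB output."""
    hit_kb = per_loop_kb(hit, loops)
    read_kb = per_loop_kb(read, loops)
    return f"Buffers: shared hit={hit_kb:.0f}kB read={read_kb:.0f}kB"


line = format_buffers(hit=31529, read=2135, loops=5326)
print(line)
```

With the numbers from the example, this yields the "hit=47kB read=3kB" line, which no longer reveals either the 31,529 hits or the 5,326 iterations.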

...Robert
