Re: [COMMITTERS] pgsql: Augment EXPLAIN output with more details on Hash nodes.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Augment EXPLAIN output with more details on Hash nodes.
Date: 2010-02-01 17:28:28
Message-ID: 603c8f071002010928r4dfc0e14oc4a420d2d881b80@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Mon, Feb 1, 2010 at 12:12 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Feb 1, 2010 at 11:53 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> This needs to be damped down a bit.  It should not print useless
>>> non-information in cases where the plan wasn't actually run.
>>> Please compare show_sort_info.
>
>> Eh?  When does it do that?
>
> Oh, I'm sorry, it's using hashtable existence to condition the whole
> output.  So my complaint is backwards.  I thought the intention was
> to print the estimated number of batches in all cases, and then the
> actual as well in EXPLAIN ANALYZE.
>
> BTW, I think "estimated" and "actual" would be less confusing
> terminology than "original".

I think (but am not 100% sure) that the number that is computed during
the plan phase is actually thrown away and recomputed during the
execution phase (grep for ExecChooseHashTableSize). So potentially
there is:

A. the number of buckets and batches estimated during planning
B. the number of buckets and batches decided on at the beginning of execution
C. the number of batches we end up using as a result of work_mem overflow

Right now I'm just printing out B and C; we could add A as well, but I
think there are some changes needed to hold on to that information for
longer than we presently do. At any rate, the terminology we settle
on should be able to accommodate potentially dumping out all of these
values.
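
To make the A/B/C distinction concrete, here is a minimal sketch of how the three batch counts could be carried together and turned into one line of output. It is an illustration only: the class, the field names, and the "planned"/"initial"/"final" labels are assumptions made for this example, not PostgreSQL identifiers and not the wording EXPLAIN actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HashBatchCounts:
    """Hypothetical container; these are not PostgreSQL field names."""
    planned: Optional[int]  # A: planner's estimate, None if it was not kept
    initial: int            # B: chosen at the beginning of execution
    final: int              # C: in use after any work_mem-driven growth


def describe(counts: HashBatchCounts) -> str:
    """Format the available counts and say how the join ended up batching."""
    parts = []
    if counts.planned is not None:
        parts.append(f"planned batches: {counts.planned}")
    parts.append(f"initial batches: {counts.initial}")
    parts.append(f"final batches: {counts.final}")

    if counts.final == 1:
        verdict = "single-batch"
    elif counts.final > counts.initial:
        verdict = "multi-batch, nbatch increased during execution"
    else:
        verdict = "multi-batch from the start"
    return ", ".join(parts) + f" ({verdict})"


if __name__ == "__main__":
    # Only B and C available, as in the current patch.
    print(describe(HashBatchCounts(planned=None, initial=2, final=8)))
    # All three values available.
    print(describe(HashBatchCounts(planned=4, initial=4, final=4)))
```

Leaving the planned value optional mirrors the situation described above: B and C can be reported today, and A can be added later without changing the labels for the other two.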

IMO, it's not worth spending an enormous amount of time on this. The
most important questions that I think people will want to answer are
(1) was this done as a multi-batch hash join?, (2) if so, did it start
out that way or did nbatch increase on the fly?, and (3) how close was
I to overflowing work_mem? I'm happy to make improvements, but I
don't think we should get too crazy.

...Robert
