Re: Open Item: Should non-text EXPLAIN always show properties?

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Open Item: Should non-text EXPLAIN always show properties?
Date: 2020-07-23 14:14:54
Message-ID: 20200723141454.GF4286@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 25, 2020 at 08:41:43AM -0400, James Coleman wrote:
> On Thu, Jun 25, 2020 at 5:15 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > Over on [1] Justin mentions that the non-text EXPLAIN ANALYZE should
> > always show the "Disk Usage" and "HashAgg Batches" properties. I
> > agree with this. show_wal_usage() is a good example of how we normally
> > do things. We try to keep the text format as humanly readable as
> > possible but don't really expect humans to be commonly reading the
> > other supported formats, so we care less about including additional
> > details there.
> >
> > There's also an open item regarding this for Incremental Sort, so I've
> > CC'd James and Tomas here. This seems like a good place to discuss
> > both.
>
> Yesterday I'd replied [1] to Justin's proposal for this WRT
> incremental sort and expressed my opinion that including both
> unnecessarily (i.e., including disk when an in-memory sort was used)
> is undesirable and confusing and leads to shortcuts I believe to be
> bad habits when using the data programmatically.

I have gone back and forth about this.

The current non-text output for Incremental Sort is like:

Sort Methods Used: +
- "quicksort" +
Sort Space Memory: +
Average Sort Space Used: 26 +
Peak Sort Space Used: 26 +

explain.c determines whether to output in non-text mode by checking:
| if (groupInfo->maxDiskSpaceUsed > 0)

Which I think is per se wrong. Either it should use a test like:
| if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT != 0)
or it should output the "Sort Space" unconditionally.

It does seem wrong if Incr Sort says "Sort Space Disk / Average: 0, Peak: 0"
when there was no disk sort at all, and it wasn't listed as a "Sort Method".

On the other hand, that's determined during execution, right? (Based on things
like table content and table order and tuple width) ? David's argument in
making the HashAgg's explain output unconditionally show Disk/Batch was that
this is not known until execution time (based on table content).

HashAgg now shows:

SET work_mem='64 MB'; explain(format yaml, analyze) SELECT a ,COUNT(1) FROM generate_series(1,99999)a GROUP BY 1;
...
Disk Usage: 0 +
HashAgg Batches: 0 +

So I think I still think incr sort should do the same, showing Disk:0.

Otherwise, I think it should at least use a test like this, rather than (DiskSpaceUsed > 0):
| if (groupInfo->sortMethods & SORT_TYPE_QUICKSORT != 0)

--
Justin

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul Förster 2020-07-23 14:16:23 Re: Building 12.3 from source on Mac
Previous Message Tom Lane 2020-07-23 14:03:48 Re: Building 12.3 from source on Mac