Re: Top-N sorts in EXPLAIN, row count estimates, and parallelism

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Top-N sorts in EXPLAIN, row count estimates, and parallelism
Date: 2019-06-06 00:45:09
Message-ID: CAH2-Wz=4TCsFCmJSqaihBoH6WTdVnXQs0uKJYFky2aQB=KrZfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 23, 2019 at 7:48 PM David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> > cost_sort() costs sorts as top-N heapsorts. While we cannot make an
> > iron-clad guarantee that it will work out that way from within
> > tuplesort.c, that doesn't seem like it closes off the possibility of
> > more informative EXPLAIN output. For example, can't we at report that
> > the tuplesort will be "bounded" within EXPLAIN, indicating that we
> > intend to attempt to sort using a top-N heap sort (i.e. we'll
> > definitely do it that way if there is sufficient work_mem)?
>
> I think this really needs more of a concrete proposal. Remember
> LIMIT/OFFSET don't need to be constants, they could be a Param or some
> return value from a subquery, so the bound might not be known until
> after executor startup, to which EXPLAIN is not going to get to know
> about that.

I was merely pointing out that it is clear when a sort *could* be a
top-n sort, which could be exposed by EXPLAIN without anyone feeling
misled.

> After that, what would we do with it in EXPLAIN? Always show "Bound:
> <n>", if it's not -1?

I'm not sure.

The distinction between a top-n sort and any other sort is an
important one (it's certainly more important than the distinction
between an internal and external sort), so it's worth being flexible
in order to expose more information in EXPLAIN output. I would be
willing to accept some kind of qualified or hedged description in the
EXPLAIN output for a bounded sort node, even though that approach
doesn't seem desirable in general.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Haribabu Kommi 2019-06-06 01:19:48 Re: pg_basebackup failure after setting default_table_access_method option
Previous Message David Rowley 2019-06-06 00:41:03 Re: No mention of no CIC support for partitioned index in docs