Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused
Date: 2018-09-13 14:04:10
Message-ID: 87a7olo9wb.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> I'm also a bit suspicious as to whether the code is even correct;
Tom> does it *really* match what will happen later when we create sort
Tom> plan nodes? (Maybe it's fine; I haven't looked.)

As things stand before the patch, the code that actually generates the
path of sort+window nodes requires only this assumption: that
order-equivalent windows (as defined by the spec) must end up together
in the list, or otherwise separated only by entries that don't add a new
sort node.

(aside: I itch to rewrite the comment that says "the spec requires that
there be only one sort" - number of sorts is an implementation detail
about which the spec is silent, what it _actually_ requires is that peer
rows must be presented in the same order in all order-equivalent
windows, which we choose to implement by ensuring there is only one sort
for such windows, rather than, for example, adding extra sort keys to
provide stability.)

The path-generation code simply concatenates the partition and order
lists and creates pathkeys. The pathkeys creation removes redundant
entries. So if we're guaranteed that two entries considered equal by the
patch code are also considered equal by the pathkey mechanism, which I
believe is the case, then the logic is still correct (enough to satisfy
the spec and produce correct query results).

There are optimizations that can be done once we have the pathkeys that
can't be anticipated by select_active_windows because that function is
run before we set up equivalence classes. This might lead path creation
to produce fewer sorts than anticipated, but not more sorts.

So I'm satisfied, as far as I can tell, that the logic is both correct
and an improvement over what we currently have.

(Perhaps worth noting for future work is that this code and the grouping
sets code have a common issue: currently we allow only one sort order to
be requested as query_pathkeys, but this means that both window paths
and grouping sets paths have to make an essentially arbitrary choice of
query_pathkeys, rather than having a set of possible "useful" orderings
and taking whichever can be produced most cheaply.)

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-09-13 14:13:41 Re: Getting ERROR: could not open file "base/13164/t3_16388" with partition table with ON COMMIT
Previous Message Tom Lane 2018-09-13 13:56:02 Re: pg_dump test instability