Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-28 22:58:10
Message-ID: CAAaqYe9g-x++U-R4yzMA-caRa0NGaK9YHqKg_=zcR6AyDsTnjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 28, 2020 at 5:30 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> On Sat, Mar 28, 2020 at 10:19:04AM -0400, James Coleman wrote:
> >On Fri, Mar 27, 2020 at 10:58 PM Tomas Vondra
> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >>
> >> ...
> >>
> >> The more I look at pathkeys_useful_for_ordering() the more I think the
> >> get_useful_pathkeys_for_relation() function should look more like it
> >> than the postgres_fdw one ...
> >
> >If we go down that path (and indeed this is actually implied by
> >removing the volatile check too), doesn't that really just mean (at
> >least for now) that get_useful_pathkeys_for_relation goes away
> >entirely and we effectively use root->query_pathkeys directly? The
> >only thing your lose them is the check that each eclass has a member
> >in the rel. But that check probably wasn't quite right anyway: at
> >least for incremental sort (maybe not regular sort), I think we
> >probably don't necessarily care that the entire list has members in
> >the rel, but rather that some prefix does, right? The idea there would
> >be that we could create a gather merge here that supplies a partial
> >ordering (that prefix of query_pathkeys) and then above it the planner
> >might place another incremental sort (say above a join), or perhaps
> >even a join above that would actually be sufficient (since many joins
> >are capable of providing an ordering anyway).
> >
> >I've attached (added prefix .txt to avoid the cfbot assuming this is a
> >full series) an idea in that direction to see if we're thinking along
> >the same route. You'll want to apply and view with `git diff -w`
> >probably.
> >
>
> Hmmm, I'm not sure the patch is quite correct.
>
> Firstly, I suggest we don't remove get_useful_pathkeys_for_relation
> entirely, because that allows us to add more useful pathkeys in the
> future (even if we don't consider pathkeys useful for merging now).
> We could also do the EC optimization in the function, return NIL and
> not loop over partial_pathlist at all. That's cosmetic issue, though.
>
> More importantly, I'm not sure this makes sense:
>
> /* consider incremental sort for interesting orderings */
> useful_pathkeys = truncate_useless_pathkeys(root, rel, subpath->pathkeys);
>
> ...
>
> is_sorted = pathkeys_common_contained_in(useful_pathkeys,
> subpath->pathkeys,
> &presorted_keys);
>
> I mean, useful_pathkeys is bound to be a sublist of subpath->pathkeys,
> right? So how could this ever return is_sorted=false?
>
> The whole point is to end up with query_pathkeys (or whatever pathkeys
> we deem useful), but this does not do that.

Yes, that's obviously a thinko in my rush to get an idea out.

I think useful_pathkeys there would be essentially
root->query_pathkeys, or, more correctly the prefix of query_pathkeys
that has eclasses shared with the current rel. Or, if we go with the
more restrictive approach (see the two approaches I mentioned
earlier), then either NIL or query_pathkeys (if they all matches
eclasses in the current rel).

Given those requirements, I agree that keeping
get_useful_pathkeys_for_relation makes sense to wrap up all of that
behavior.

James

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-03-28 22:59:22 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Justin Pryzby 2020-03-28 22:30:52 debian bugrept involving fast default crash in pg11.7