Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-29 14:21:06
Message-ID: CAAaqYe-M==BwbC47LeMM2HcUGHqCdynNMMyncrBa7t0AGzZGWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 28, 2020 at 11:23 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Sat, Mar 28, 2020 at 11:14 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >
> > On Sat, Mar 28, 2020 at 10:47:49PM -0400, James Coleman wrote:
> > >On Sat, Mar 28, 2020 at 6:59 PM Tomas Vondra
> > ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > >>
> > >> Hi,
> > >>
> > >> Attached is my take on simplification of the useful pathkeyes thing. It
> > >> keeps the function, but it truncates query_pathkeys to only members with
> > >> EC members in the relation. I think that's essentially the optimization
> > >> you've proposed.
> > >
> > >Thanks. I've included that in the patch series in this email (as a
> > >separate patch) with a few additional comments.
> > >
> >
> > Thanks.
> >
> > >I've also noticed that the enabled_incrementalsort check in
> > >generate_useful_gather_paths seemed broken, because it returned us out
> > >of the function before creating either a plain gather merge (if
> > >already sorted) or an explicit sort path. I've included a patch that
> > >moves it to the if block that actually builds the incremental sort
> > >path.
> > >
> >
> > Hmmm, that's probably right.
> >
> > The reason why the GUC check was right after generate_gather_paths is
> > that the intent to disable all the useful-pathkeys business, essentially
> > reducing it back to plain generate_gather_paths.
> >
> > But I think you're right that's wrong, because it might lead to strange
> > behavior when the GUC switches between plans without any incremental
> > sort nodes - setting it to 'true' might end up picking GM on top of
> > plain Sort, for example.
>
> Thanks.
>
> > >>
> > >> ...
> > >>
> > >> 1) Missing worker identification (Worker #).
> > >
> > >Fixed.
> > >
> > >> 2) Missing method for workers (we have it for the leader, though).
> > >
> > >Fixed. Since we can't have pointers in the parallel shared memory
> > >space, we can't store the sort methods used in a list. To accomplish
> > >the same goal, I've assigned the TuplesortMethod enum entries uique
> > >bit positions, and store the methods used in a bitmask.
> > >
> >
> > OK, makes sense.
> >
> > >> 3) I'm not sure why the lable is "Methods" instead of "Sort Method", and
> > >> why it's in parenthesis.
> > >
> > >I've removed the parentheses. It's labeled "Methods" since there can
> > >be more than one (different batches could use different methods). I've
> > >updated this to properly use singular/plural depending on the number
> > >of methods used.
> > >
> >
> > I'm a bit confused. How do I know which batch used which method? Is it
> > actually worth printing in explain analyze? Maybe only print it in the
> > verbose mode?
>
> The alternative is showing no sort method information at all, or only
> showing it if all batches used the same method (which seems confusing
> to me). It seems weird that we wouldn't try to find some rough
> analogue to what a regular sort node outputs, so I've attempted to
> summarize.
>
> This is similar to the memory information: the average doesn't apply
> to any one batch, and you don't know which one (or how many) hit the
> peak memory usage either, but I think it's meaningful to know a
> summary.
>
> With the sort methods, I think it's useful to be able to, for example,
> know if any of the groups happened to trigger the top-n heapsort
> optimization, or not, and as a corollary, if all of them did or not.
>
> > >> 4) Not sure having two lines for each worker is a great idea.
> > >
> > >I've left these in for now because the lines are already very long
> > >(much, much longer than the worker lines in a standard sort node).
> > >This is largely because we're trying to summarize many sort batches,
> > >while standard sort nodes only have to give the exact stats from a
> > >single batch.
> > >
> > >See the example output later in the email.
> > >
> >
> > OK
> >
> > >> 5) I'd probably prefer having multiple labels for avg/max memory values,
> > >> instead of (avg) and (max) notes. Also, I think we use "peak" in this
> > >> context instead of "max".
> > >
> > >Updated.
> > >
> >
> > OK
> >
> > >Here's the current output:
> > >
> > > Limit (cost=1887419.20..1889547.68 rows=10000 width=8) (actual
> > >time=13218.403..13222.519 rows=10000 loo
> > >ps=1)
> > > -> Gather Merge (cost=1887419.20..19624748.03 rows=83333360
> > >width=8) (actual time=13218.401..13229.7
> > >50 rows=10000 loops=1)
> > > Workers Planned: 2
> > > Workers Launched: 2
> > > -> Incremental Sort (cost=1886419.17..10005010.55
> > >rows=41666680 width=8) (actual time=13208.00
> > >4..13208.586 rows=4425 loops=3)
> > > Sort Key: a, b
> > > Presorted Key: a
> > > Full-sort Groups: 1 Sort Method: quicksort Memory:
> > >avg=28kB peak=28kB
> > > Presorted Groups: 1 Sort Method: top-N heapsort Memory:
> > >avg=1681kB peak=1681kB
> > > Worker 0: Full-sort Groups: 1 Sort Method: quicksort
> > >Memory: avg=28kB peak=28kB
> > > Presorted Groups: 1 Sort Method: top-N heapsort
> > >Memory: avg=1680kB peak=1680kB
> > > Worker 1: Full-sort Groups: 1 Sort Method: quicksort
> > >Memory: avg=28kB peak=28kB
> > > Presorted Groups: 1 Sort Method: top-N heapsort
> > >Memory: avg=1682kB peak=1682kB
> > > -> Parallel Index Scan using index_s_a on s
> > >(cost=0.57..4967182.06 rows=41666680 width=8
> > >) (actual time=0.455..11730.878 rows=6666668 loops=3)
> >
> > Looks reasonable. Did you try it in other output formats - json/yaml?
>
> I did. JSON looks good also, which implies to me yaml would to (but I
> didn't look at it).

After sleeping on it, I decided the XXX I'd left in my explain fixes
was too gross to keep around, so I've replaced it with a macro that
selects the proper shared or private memory group info struct (that
way we avoid the pointer comparison hack to reverse engineer (in the
parallel worker case) which group info we were looking to use.

James

Attachment Content-Type Size
v45-0004-A-couple-more-places-for-incremental-sort.patch text/x-patch 9.2 KB
v45-0005-rework-of-get_useful_pathkeys_for_relation.patch text/x-patch 2.7 KB
v45-0001-Consider-low-startup-cost-when-adding-partial-pa.patch text/x-patch 4.6 KB
v45-0002-Implement-incremental-sort.patch text/x-patch 164.5 KB
v45-0003-Consider-incremental-sort-paths-in-additional-pl.patch text/x-patch 15.7 KB
v45-0006-fix-inc-sort-enabled-check.patch text/x-patch 1.3 KB
v45-0007-explain-fixes.patch text/x-patch 12.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matheus de Oliveira 2020-03-29 14:39:25 Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT
Previous Message Juan José Santamaría Flecha 2020-03-29 12:55:37 Re: color by default