Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-30 22:53:47
Message-ID: CAAaqYe8hK3FpBRkAtTn1L_sETMUivoR=knoLf5by6Q7yLZnzTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 30, 2020 at 8:24 AM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> On Sun, Mar 29, 2020 at 10:16:53PM -0400, James Coleman wrote:
> >On Sun, Mar 29, 2020 at 9:44 PM Tomas Vondra
> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >>
> >> Hi,
> >>
> >> Attached is a slightly reorganized patch series. I've merged the fixes
> >> into the appropriate matches, and I've also combined the two patches
> >> adding incremental sort paths to additional places in planner.
> >>
> >> A couple more comments:
> >>
> >>
> >> 1) I think the GUC documentation in src/sgml/config.sgml is a bit too
> >> detailed, compared to the other enable_* GUCs. I wonder if there's a
> >> better place where to move the details. What about adding some examples
> >> and explanation to perform.sgml?
> >
> >I'll take a look at that and include in a patch series tomorrow.

Attached.

> >> 2) Looking at the explain output, the verbose mode looks like this:
> >>
> >> test=# explain (verbose, analyze) select a from t order by a, b, c;
> >> QUERY PLAN
> >> --------------------------------------------------------------------------------------------------------------------------------------------------------------
> >> Gather Merge (cost=66.31..816072.71 rows=8333226 width=24) (actual time=4.787..20092.555 rows=10000000 loops=1)
> >> Output: a, b, c
> >> Workers Planned: 2
> >> Workers Launched: 2
> >> -> Incremental Sort (cost=66.28..729200.36 rows=4166613 width=24) (actual time=1.308..14021.575 rows=3333333 loops=3)
> >> Output: a, b, c
> >> Sort Key: t.a, t.b, t.c
> >> Presorted Key: t.a, t.b
> >> Full-sort Groups: 4169 Sort Method: quicksort Memory: avg=30kB peak=30kB
> >> Presorted Groups: 4144 Sort Method: quicksort Memory: avg=128kB peak=138kB
> >> Worker 0: actual time=0.766..16122.368 rows=3841573 loops=1
> >> Full-sort Groups: 6871 Sort Method: quicksort Memory: avg=30kB peak=30kB
> >> Presorted Groups: 6823 Sort Method: quicksort Memory: avg=132kB peak=141kB
> >> Worker 1: actual time=1.986..16189.831 rows=3845490 loops=1
> >> Full-sort Groups: 6874 Sort Method: quicksort Memory: avg=30kB peak=30kB
> >> Presorted Groups: 6847 Sort Method: quicksort Memory: avg=130kB peak=139kB
> >> -> Parallel Index Scan using t_a_b_idx on public.t (cost=0.43..382365.92 rows=4166613 width=24) (actual time=0.040..9808.449 rows=3333333 loops=3)
> >> Output: a, b, c
> >> Worker 0: actual time=0.048..11275.178 rows=3841573 loops=1
> >> Worker 1: actual time=0.041..11314.133 rows=3845490 loops=1
> >> Planning Time: 0.166 ms
> >> Execution Time: 25135.029 ms
> >> (22 rows)
> >>
> >> There seems to be missing indentation for the first line of worker info.
> >
> >Working on that too.

See attached. I've folded in the original "explain fixes" patch into
the main series, and the "explain fixes" patch in this series contains
only the changes for the above.

> >> I'm still not quite convinced we should be printing two lines - I know
> >> you mentioned the lines might be too long, but see how long the other
> >> lines may get ...
> >
> >All right, I give in :)
> >
> >Do you think non-workers (both the leader and non-parallel plans)
> >should also move to one line?
> >
>
> I think we should use the same formatting for both cases, so yes.
>
> FWIW I forgot to mention I tweaked the INSTRUMENT_SORT_GROUP macro a
> bit, by moving the if condition in it. That makes the calls easier.

Ah, that actually fixed some of the compile warnings. The other is
fixed in my explain fixes patch.

> >> 3) I see the new nodes (plan state, ...) have "presortedCols" which does
> >> not indicate it's a "number of". I think we usually prefix names of such
> >> fields "n" or "num". What about "nPresortedCols"? (Nitpicking, I know.)
> >
> >I can fix this too.

Changed everywhere we used this var name. I'm tempted to change to
nPresortedKeys, but a cursory glance suggests some cases might
actually be consistent with other var names reference columns, so I'm
not sure if we want to go down that path (and change more than just
this).

> >
> >Also I noticed a few compiler warnings I'll fixup in tomorrow's reply.
> >
>
> OK

Mentioned above.

Thanks,
James

Attachment Content-Type Size
v47-0001-Consider-low-startup-cost-when-adding-partial-pa.patch text/x-patch 4.6 KB
v47-0004-docs-updates.patch text/x-patch 4.0 KB
v47-0003-explain-fixes.patch text/x-patch 4.3 KB
v47-0005-Consider-incremental-sort-paths-in-additional-pl.patch text/x-patch 24.2 KB
v47-0002-Implement-incremental-sort.patch text/x-patch 165.2 KB
v47-0006-nPresortedCols.patch text/x-patch 9.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2020-03-30 22:56:58 Re: backup manifests
Previous Message Nikita Glukhov 2020-03-30 22:47:05 Re: fix for BUG #3720: wrong results at using ltree