Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-04-06 23:27:19
Message-ID: CAAaqYe9ZCNrF8Q-Ouy1Bg4DPbb2+7bBLH=OAdOQYAkgkQi0+=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 6, 2020 at 7:09 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Mon, Apr 6, 2020 at 6:13 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >
> > On Mon, Apr 06, 2020 at 05:47:48PM -0400, James Coleman wrote:
> > >On Mon, Apr 6, 2020 at 5:40 PM Tomas Vondra
> > ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > >>
> > >> On Mon, Apr 06, 2020 at 11:12:32PM +0200, Tomas Vondra wrote:
> > >> >On Mon, Apr 06, 2020 at 04:54:38PM -0400, Alvaro Herrera wrote:
> > >> >>On 2020-Apr-06, Tom Lane wrote:
> > >> >>
> > >> >>>Locally, things pass without force_parallel_mode, but turning it on
> > >> >>>produces failures that look similar to rhinoceros's (didn't examine
> > >> >>>other BF members).
> > >> >>
> > >> >>FWIW I looked at the eight failures there were about fifteen minutes ago
> > >> >>and they were all identical. I can confirm that, in my laptop, the
> > >> >>tests work without that GUC, and fail in exactly that way with it.
> > >> >>
> > >> >
> > >> >Yes, there's a thinko in show_incremental_sort_info() and it returns too
> > >> >soon. I'll push a fix in a minute.
> > >> >
> > >>
> > >> OK, I've pushed a fix - this should make the buildfarm happy again.
> > >>
> > >> It however seems to me a bit more needs to be done. The fix makes
> > >> show_incremental_sort_info closer to show_sort_info, but not entirely
> > >> because IncrementalSortState does not have sort_Done flag so it still
> > >> depends on (fullsortGroupInfo->groupCount > 0). I haven't noticed that
> > >> before, but not having that flag seems a bit weird to me.
> > >>
> > >> It also seems possibly incorrect - we may end up with
> > >>
> > >> fullsortGroupInfo->groupCount == 0
> > >> prefixsortGroupInfo->groupCount > 0
> > >>
> > >> but we won't print anything.
> > >
> > >This shouldn't ever be possible, because the only way we get any
> > >prefix groups at all is if we've already sorted a full sort group
> > >during the mode transition.
> > >
> > >> James, any opinion on this? I'd say we should restore the sort_Done flag
> > >> and make it work as in plain Sort. Or some comment explaining why
> > >> depending on the counts is OK (assuming it is).
> > >
> > >There's previous email traffic on this thread about that (I can look
> > >it up later this evening), but the short of it is that I believe that
> > >relying on the group count is actually more correct than a sort_Done
> > >flag in the case of incremental sort (in contrast to regular sort).
> > >
> >
> > OK. Maybe we should add a comment to explain.c saying it's OK.
> >
> > I've pushed a fix for failures due to different planned workers (in the
> > test I added to show changes due to add_partial_path tweaks).
> >
> > It seems we're not out of the woods yet, though. rhinoceros and
> > sidewinder failed with something like this:
> >
> > Sort Method: quicksort Memory: NNkB
> > + Sort Method: unknown Disk: NNkB
> >
> > Would you mind investigating at it?
>
> I assume that means those build farm members run with very low
> work_mem? Is it an acceptable fix to adjust work_mem up a bit just for
> these tests? Or is that bad practice and these are to expose issues
> with changing into disk sort mode?

On rhinoceros I see:

================== pgsql.build/src/test/regress/regression.diffs
===================
diff -U3 /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/expected/subselect.out
/opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/results/subselect.out
--- /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/expected/subselect.out
2020-03-14 10:37:49.156761104 -0700
+++ /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/results/subselect.out
2020-04-06 16:01:13.766798059 -0700
@@ -1328,8 +1328,9 @@
-> Sort (actual rows=3 loops=1)
Sort Key: sq_limit.c1, sq_limit.pk
Sort Method: top-N heapsort Memory: xxx
+ Sort Method: unknown Disk: 0kB
-> Seq Scan on sq_limit (actual rows=8 loops=1)
-(6 rows)
+(7 rows)

Same on sidewinder.

Given the 0kB I'm not sure this is *just* a work_mem thing, though
that's still something I'm curious to know about, and it's still part
of the "problem" here.

I'm investigating further.

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-04-06 23:31:43 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message James Coleman 2020-04-06 23:09:11 Re: [PATCH] Incremental sort (was: PoC: Partial sort)