Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-25 02:08:54
Message-ID: CAAaqYe-mmeb9gSL7HMYX_YpH0jCv60rEeBzw3C8G95n0z+0Lxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 24, 2020 at 8:58 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Tue, Mar 24, 2020 at 8:23 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > While working on finding a test case to show rescan isn't implemented
> > properly yet, I came across a bug. At the top of
> > ExecInitIncrementalSort, we assert that eflags does not contain
> > EXEC_FLAG_REWIND. But the following query (with merge and hash joins
> > disabled) breaks that assertion:
> >
> > select * from t join (select * from t order by a, b) s on s.a = t.a
> > where t.a in (1,2);
> >
> > The comments about this flag in src/include/executor/executor.h say:
> >
> > * REWIND indicates that the plan node should try to efficiently support
> > * rescans without parameter changes. (Nodes must support ExecReScan calls
> > * in any case, but if this flag was not given, they are at liberty to do it
> > * through complete recalculation. Note that a parameter change forces a
> > * full recalculation in any case.)
> >
> > Now we know that except in rare cases (as just discussed recently up
> > thread) we can't implement rescan efficiently.
> >
> > So is this a planner bug (i.e., should we try not to generate
> > incremental sort plans that require efficient rewind)? Or can we just
> > remove that part of the assertion and know that we'll implement the
> > rescan, albeit inefficiently? We already explicitly declare that we
> > don't support backwards scanning, but I don't see a way to declare the
> > same for rewind.
>
> Other nodes seem to get a materialization node placed above them to
> support this case "better". Is that something we should be doing?
>
> > I'm going to try the latter approach now to see if it at least solves
> > the immediate problem...
>
> So a couple of interesting results here. First, it does seem to fix
> the assertion failure, and rescan is being used in this case (as I
> expected).

I've fixed the rescan implementation and added a test. I think we
might actually be able to get away with one basic test (it just has to
exercise both full and prefix sort states). Note, this also allowed me
to get rid of the sort_Done incremental sort state attribute that I'd
wondered earlier if we'd actually need.

In the attached patch series I've collapsed my previous fix commits,
but included the changes here as a separate patch in the series so you
can see the changes more easily.

James

Attachment Content-Type Size
v41-0003-fix-rescan.patch text/x-patch 9.0 KB
v41-0004-Consider-incremental-sort-paths-in-additional-pl.patch text/x-patch 13.7 KB
v41-0005-A-couple-more-places-for-incremental-sort.patch text/x-patch 9.2 KB
v41-0001-Consider-low-startup-cost-when-adding-partial-pa.patch text/x-patch 4.6 KB
v41-0002-Implement-incremental-sort.patch text/x-patch 158.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Artur Zakirov 2020-03-25 02:12:05 Re: pg_upgrade fails with non-standard ACL
Previous Message Jeff Davis 2020-03-25 01:12:03 AllocSetEstimateChunkSpace()