From: | Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org> |
Subject: | Re: PoC: Partial sort |
Date: | 2016-12-05 05:04:43 |
Message-ID: | CAJrrPGcZXRWz4qs3G_LjfJ3k0jp5F3YEs0Xm5bYAY-_WkogfTg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Dec 2, 2016 at 4:05 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Sep 13, 2016 at 4:32 AM, Alexander Korotkov
> <aekorotkov(at)gmail(dot)com> wrote:
> > On Fri, Apr 8, 2016 at 10:09 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> >> On Wed, Mar 30, 2016 at 8:02 AM, Alexander Korotkov
> >> <aekorotkov(at)gmail(dot)com> wrote:
> >> > Hmm... I'm not completely agree with that. In typical usage partial
> sort
> >> > should definitely use quicksort. However, fallback to other sort
> >> > methods is
> >> > very useful. Decision of partial sort usage is made by planner. But
> >> > planner makes mistakes. For example, our HashAggregate is purely
> >> > in-memory.
> >> > In the case of planner mistake it causes OOM. I met such situation in
> >> > production and not once. This is why I'd like partial sort to have
> >> > graceful
> >> > degradation for such cases.
> >>
> >> I think that this should be moved to the next CF, unless a committer
> >> wants to pick it up today.
> >
> > Patch was rebased to current master.
>
> Just a few quick observations on this...
>
> It strikes me that the API contract change in ExecMaterializesOutput
> is pretty undesirable. I think it would be better to have a new
> executor node for this node rather than overloading the existing
> "Sort" node, sharing code where possible of course. The fact that
> this would distinguish them more clearly in an EXPLAIN plan seems
> good, too. "Partial Sort" is the obvious thing, but there might be
> even better alternatives -- maybe "Incremental Sort" or something like
> that? Because it's not partially sorting the data, it's making data
> that already has some sort order have a more rigorous sort order.
>
> I think that it will probably be pretty common to have queries where
> the data is sorted by (mostly_unique_col) and we want to get it sorted
> by (mostly_unique_col, disambiguation_col). In such cases I wonder if
> we'll incur a lot of overhead by feeding single tuples to the
> tuplesort stuff and performing lots of 1-item sorts. Not sure if that
> case needs any special optimization.
>
> I also think that the "HeapTuple prev" bit in SortState is probably
> not the right way of doing things. I think that should use an
> additional TupleTableSlot rather than a HeapTuple.
>
>
The feedback from the reviewer has received at the end of commitfest.
So Moved to next CF with "waiting on author" status.
Regards,
Hari Babu
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Haribabu Kommi | 2016-12-05 05:06:50 | Re: Parallel Index Scans |
Previous Message | Haribabu Kommi | 2016-12-05 05:03:31 | Re: [PATCH] Generic type subscription |