Re: PoC: Partial sort

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>
Subject: Re: PoC: Partial sort
Date: 2014-01-27 19:26:39
Message-ID: CAPpHfdsSTnooHmssbyRFGc_uW5JAX4v0y=AhDZHArzCHNdxLyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi!

On Tue, Jan 21, 2014 at 3:24 AM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:

> On Tue, Jan 14, 2014 at 5:49 PM, Alexander Korotkov
> <aekorotkov(at)gmail(dot)com> wrote:
> >On Tue, Jan 14, 2014 at 12:54 AM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
> >> I've been trying it out in a few situations. I implemented a new
> >> enable_partialsort GUC to make it easier to turn on/off, this way it's
> a lot
> >> easier to test. The attached patch applies on top of
> partial-sort-5.patch
> >
> > I though about such option. Generally not because of testing convenience,
> > but because of overhead of planning. This way you implement it is quite
> > naive :)
>
> I don't understand. I had another look at this and cost_sort still
> seems like the best place to implement this, since that's where the
> patch decides how many pre-sorted columns to use. Both mergejoin and
> simple order-by plans call into it. If enable_partialsort=false then I
> skip all pre-sorted options except full sort, making cost_sort behave
> pretty much like it did before the patch.
>
> I could change pathkeys_common to return 0, but that seems like a
> generic function that shouldn't be tied to partialsort. The old code
> paths called pathkeys_contained_in anyway, which has similar
> complexity. (Apart for initial_cost_mergejoin, but that doesn't seem
> special enough to make an exception for).
>
> Or should I use?:
> enable_partialsort ? pathkeys_common(...) : 0
>
> > For instance, merge join rely on partial sort which will be
> > replaced with simple sort.
>
> Are you saying that enable_partialsort=off should keep
> partialsort-based mergejoins enabled?
>
> Or are you saying that merge joins shouldn't use "simple sort" at all?
> But merge join was already able to use full Sort nodes before your
> patch.
>

Sorry that I didn't explained it. In particular I mean following:
1) With enable_partialsort = off all mergejoin logic should behave as
without partial sort patch.
2) With partial sort patch get_cheapest_fractional_path_for_pathkeys
function is much more expensive to execute. With enable_partialsort = off
it should be as cheap as without partial sort patch.
I'll try to implement this option in this week.
For now, I have attempt to fix extra columns in mergejoin problem. It would
be nice if you test it.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
partial-sort-7.patch.gz application/x-gzip 17.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-01-27 19:55:42 Re: Failure while inserting parent tuple to B-tree is not fun
Previous Message Alvaro Herrera 2014-01-27 19:20:23 Re: [PATCH] Use MAP_HUGETLB where supported (v3)