Re: B-Tree index builds, CLUSTER, and sortsupport

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: B-Tree index builds, CLUSTER, and sortsupport
Date: 2014-11-06 10:51:45
Message-ID: 545B52C1.6010400@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/06/2014 02:30 AM, Peter Geoghegan wrote:
> Thanks for the review.
>
> On Wed, Nov 5, 2014 at 4:33 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>> I looked at the changes to the code. The new code is clean and there is more
>> code re-use and improved readability. On possible further improvement would
>> be to move the preparation of SortSupport to a common function since this is
>> done three time in the code.
>
> The idea there is to have more direct control of sortsupport. With the
> abbreviated keys patch, abbreviation occurs based on a decision made
> by tuplesort.c. I can see why you'd say that, but I prefer to keep
> initialization of sortsupport structs largely concentrated in
> tuplesort.c, and more or less uniform regardless of the tuple-type
> being sorted.

Ok, I am fine with either.

>> Is there any case where we should expect any greater performance
>> improvement?
>
> The really compelling case is abbreviated keys - as you probably know,
> there is a patch that builds on this patch, and the abbreviated keys
> patch, so that B-Tree builds and CLUSTER can use abbreviated keys too.
> That doesn't really have much to do with this patch, though. The
> important point is that heap tuple sorting (with a query that has no
> client overhead, and involves one big sort) and B-Tree index creation
> both have their tuplesort as a totally dominant cost. The improvements
> for each case should be quite comparable, which is good (except, as
> noted in my opening e-mail, when heap/datum tuplesorting can use the
> onlyKey optimization, while B-Tree/CLUSTER tuplesorting cannot).

Ah, I see.

--
Andreas Karlsson

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2014-11-06 10:56:36 Re: two dimensional statistics in Postgres
Previous Message Gavin Flower 2014-11-06 10:50:57 Re: two dimensional statistics in Postgres