Re: B-Tree support function number 3 (strxfrm() optimization)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Date: 2014-11-10 03:02:48
Message-ID: CAM3SWZRw=f9UXnh0TFya=4kK39mm1qHUGxgTcAG9H2d-wF9cgw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Oct 11, 2014 at 6:34 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Attached patch, when applied, accelerates all tuplesort cases using
> abbreviated keys, building on previous work here, as well as the patch
> posted to that other thread.

I attach an updated patch set, rebased on top of the master branch's
tip. All relevant tuplesort cases (B-Tree, MinimalTuple, CLUSTER) are
now directly covered by the patch set, since there is now general
sortsupport support for those cases in the master branch -- no need to
apply some other patch from some other thread.

For the convenience of reviewers, this new revision includes a new
approach to making my improvements cumulative: A second commit adds
tuple count estimation. This hint, passed along to the text opclass's
convert routine, is taken from the optimizer's own estimate, or the
relcache's reltuples, depending on the tuplesort case being
accelerated. As in previous revisions, the idea is to give the opclass
a sense of proportion about how far along it is, to be weighed in
deciding whether or not to abort abbreviation. One potentially
controversial aspect of that is how the text opclass's abbreviation
cost model (the abort-early logic) weighs simply having many tuples:
past a certain point, it *always* proceeds with abbreviation, no
matter what the cardinality of abbreviated keys is so far. For that
reason in particular, it seemed to make sense to split these parts out
into a second commit.
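
To make the shape of that decision concrete, here is a rough sketch
of an abort check of this kind. It is not code from the attached
patches: the struct, the function name, and every threshold are
hypothetical, and the "late progress" rule is only one assumed way the
row-count hint might be weighed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; not the struct used by the attached patches. */
typedef struct AbbrevProgress
{
    double est_total_rows;   /* row-count hint; <= 0 means "unknown" */
    double rows_seen;        /* tuples converted so far */
    double distinct_abbrev;  /* estimated distinct abbreviated keys so far */
} AbbrevProgress;

/* All thresholds below are made-up values for illustration only. */
#define MIN_ROWS_BEFORE_CHECK   1000.0    /* too early to judge below this */
#define ALWAYS_PROCEED_ROWS     100000.0  /* never abort past this many rows */
#define MIN_DISTINCT_FRACTION   0.05      /* abort if cardinality is lower */
#define LATE_PROGRESS_FRACTION  0.5       /* assumed: too far along to abort */

/* Returns true if abbreviation should be abandoned at this point. */
bool
abbrev_should_abort(const AbbrevProgress *p)
{
    assert(p->rows_seen >= 0.0 && p->distinct_abbrev >= 0.0);

    /* Not enough tuples seen yet for the cardinality estimate to mean much. */
    if (p->rows_seen < MIN_ROWS_BEFORE_CHECK)
        return false;

    /* With many tuples, always proceed, whatever the cardinality so far. */
    if (p->rows_seen >= ALWAYS_PROCEED_ROWS)
        return false;

    /* Sense of proportion: if the hint says we are mostly done, keep going. */
    if (p->est_total_rows > 0.0 &&
        p->rows_seen / p->est_total_rows >= LATE_PROGRESS_FRACTION)
        return false;

    /* Otherwise abort only when abbreviated keys are poorly distinguishing. */
    return p->distinct_abbrev / p->rows_seen < MIN_DISTINCT_FRACTION;
}
```

In this sketch, 10,000 rows seen with only 100 distinct abbreviated
keys aborts when the hint is 1,000,000 rows, but the same counts do
not abort once 200,000 rows have been seen, which is the "always
proceeds" behavior described above.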

I hope that we can finish up all 9.5 work on accelerated sorting soon.
--
Peter Geoghegan

Attachment                                             Content-Type  Size
0002-Estimate-total-number-of-rows-to-be-sorted.patch  text/x-patch  15.6 KB
0001-Abbreviated-sortsupport-keys.patch                text/x-patch  60.5 KB
