Re: Re: Abbreviated keys for Datum tuplesort

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: Abbreviated keys for Datum tuplesort
Date: 2015-03-14 00:54:46
Message-ID: CAM3SWZTVMsES7Er8V9_nvs6byNApG1_7U2iSjYWsdjExcZsX=g@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jan 25, 2015 at 3:15 AM, Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> Robert> I think this is a good idea. Do you have a test case that
> Robert> shows the benefit?
>
> The best test case for datum sort performance is to use percentile_disc,
> since that has almost no overhead beyond performing the actual sort.

I attach a slightly tweaked version of Andrew's original.

This revision makes the reverted comments within orderedsetaggs.c
consistent with the back branches. There is no longer any need to call
abbreviation out as an interesting special case there, just as in the
back branches, where abbreviation doesn't even exist. It's better to
keep those comments consistent for backpatching purposes. Also, I've
changed the tuplesort.c header comments back to how they read from
November until recently, to reflect that now it really is the case
that only the hash index case doesn't have the "sortKeys" field
reliably set. We now need to set "sortKeys" for the datum case, so the
comments shouldn't say that we don't. It also means that we need to
worry about the applicability of the onlyKey optimization for the
datum sort case now, too.

Other than that, there are a number of minor stylistic tweaks. The
datum case does not support pass-by-value abbreviation, which could be
useful in theory (e.g., abbreviation of float8 values, which was
briefly discussed before). This isn't worth calling out as a special
case in the tuplesort header comments IMV; there is now a brief note
on this added to tuplesort_begin_datum(). We still support
abbreviation for pass-by-value types in the non-datum-sort cases
(there is of course no known example of opclass abbreviation support
for a pass-by-value type, so this is only a theoretical deficiency).
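
For anyone skimming the thread who hasn't looked at abbreviation
closely, here is a toy, standalone sketch of the general idea for a
pass-by-reference value: compare a cheap fixed-size key first, and fall
back to the full comparator only on a tie. This is not code from the
patch or from tuplesort.c. The names SortItem, make_abbrev,
compare_items and sort_with_abbrev are hypothetical, and it assumes
plain NUL-terminated strings compared bytewise (no collations).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; none of these names exist in PostgreSQL. */
typedef struct SortItem
{
    uint64_t    abbrev;     /* leading 8 bytes, packed most significant first */
    const char *full;       /* the full pass-by-reference value */
} SortItem;

/*
 * Pack up to 8 leading bytes so that unsigned integer order matches
 * bytewise (strcmp) order.  Short strings are padded with zero bytes.
 */
static uint64_t
make_abbrev(const char *s)
{
    uint64_t    key = 0;
    size_t      len = strlen(s);

    for (size_t i = 0; i < 8; i++)
    {
        unsigned char c = (i < len) ? (unsigned char) s[i] : 0;

        key = (key << 8) | c;
    }
    return key;
}

/* Cheap integer comparison first; full comparison only to break ties. */
static int
compare_items(const void *a, const void *b)
{
    const SortItem *x = a;
    const SortItem *y = b;

    if (x->abbrev != y->abbrev)
        return (x->abbrev < y->abbrev) ? -1 : 1;
    return strcmp(x->full, y->full);
}

/* Sort an array of C strings in place; returns 0 on success, -1 on OOM. */
static int
sort_with_abbrev(const char **vals, size_t n)
{
    SortItem   *items;

    assert(vals != NULL || n == 0);
    if (n == 0)
        return 0;

    items = malloc(n * sizeof(SortItem));
    if (items == NULL)
        return -1;

    for (size_t i = 0; i < n; i++)
    {
        items[i].abbrev = make_abbrev(vals[i]);
        items[i].full = vals[i];
    }

    qsort(items, n, sizeof(SortItem), compare_items);

    for (size_t i = 0; i < n; i++)
        vals[i] = items[i].full;

    free(items);
    return 0;
}
```

Most comparisons here are resolved by the integer test alone, without
touching the out-of-line string. Two values that share their first
eight bytes still sort correctly, because the tie falls through to the
full comparison.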

I've marked this "ready for committer".

Thanks
--
Peter Geoghegan

Attachment: datumsort-revisions.patch (text/x-patch, 8.1 KB)