Re: B-Tree support function number 3 (strxfrm() optimization)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Marti Raudsepp <marti(at)juffo(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thom(at)linux(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: B-Tree support function number 3 (strxfrm() optimization)
Date: 2014-09-12 00:34:45
Message-ID: CAM3SWZQCDCnfWd3qzoO4QmY4G8oKHUqyrd26bBLa7FL2x-nTjg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 9, 2014 at 2:25 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I like that I don't have to care about every combination, and can
>> treat abbreviation abortion as the special case with the extra step,
>> in line with how I think of the optimization conceptually. Does that
>> make sense?
>
> No. comparetup_heap() is hip-deep in this optimization as it stands,
> and what I proposed - if done correctly - isn't going to make that
> significantly worse. In fact, it really ought to make things better:
> you should be able to set things up so that ssup->comparator is always
> the test that should be applied first, regardless of whether we're
> aborted or not-aborted or not doing this in the first place; and then
> ssup->tiebreak_comparator, if not NULL, can be applied after that.

I'm not following here. Isn't that at least equally true of what I've
proposed? Sure, I'm checking "if (!state->abbrevAborted)" first, but
that's irrelevant to the non-abbreviated case. It has nothing to
abort. Also, AFAICT we cannot abort and still call ssup->comparator()
indifferently, since sorttuple.datum1 fields are perhaps abbreviated
keys in half of all cases (i.e. pre-abort tuples), and uninitialized
garbage the other half of the time (i.e. post-abort tuples).

Where is the heap_getattr() stuff supposed to happen for the first
attribute to get an authoritative comparison in the event of aborting
(if we're calling ssup->comparator() on datum1 indifferently)? We
decide that we're going to use abbreviated keys within datum1 fields
up-front. When we abort, we cannot use datum1 fields at all (which
doesn't appear to matter for text -- the datum1 optimization has
historically only benefited pass-by-value types).

I'd mentioned that I'd hacked together a patch that doesn't
necessitate a separate state (if only to save a very small amount of
memory), but it is definitely messier within comparetup_heap(). I'm
still tweaking it. FYI, it does this within comparetup_heap():

+ if (!sortKey->abbrev_comparator)
+ {
+ /*
+ * There are no abbreviated keys to begin with (i.e.
no opclass support
+ * exists). Compare the leading sort key, assuming an
authoritative
+ * representation.
+ */
+ compare = ApplySortComparator(a->datum1, a->isnull1,
+
b->datum1, b->isnull1,
+
sortKey);
+ if (compare != 0)
+ return compare;
+
+ sortKey++;
+ nkey = 1;
+ }
+ else if (!state->abbrevAborted && sortKey->abbrev_comparator)
+ {
+ /*
+ * Leading attribute has abbreviated key representation, and
+ * abbreviation was not aborted when copying. Compare
the leading sort
+ * key using abbreviated representation.
+ */
+ compare = ApplySortAbbrevComparator(a->datum1, a->isnull1,
+
b->datum1, b->isnull1,
+
sortKey);
+ if (compare != 0)
+ return compare;
+
+ /*
+ * Since abbreviated comparison returned 0, call
tie-breaker comparator
+ * using original, authoritative representation, which
may break tie
+ * when differences were not captured within
abbreviated representation
+ */
+ nkey = 0;
+ }
+ else
+ {
+ /*
+ * Started with abbreviated keys, but aborted during
conversion/tuple
+ * copying -- check each attribute from scratch. It's
not safe to make
+ * any assumption about the state of individual datum1 fields.
+ */
+ nkey = 0;
+ }

Is doing all this worth the small saving in memory? Personally, I
don't think that it is, but I defer to you.
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kouhei Kaigai 2014-09-12 00:40:01 Re: [v9.5] Custom Plan API
Previous Message Michael Paquier 2014-09-12 00:17:53 Re: Incorrect initialization of sentPtr in walsender.c