Re: Abbreviated keys for Numeric

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Subject: Re: Abbreviated keys for Numeric
Date: 2015-03-21 06:58:20
Message-ID: 87oannf0f8.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>>>> "Peter" == Peter Geoghegan <pg(at)heroku(dot)com> writes:

>> This is simply wrong. The reason why the cost model (in my version)
>> tracks non-null values by having its own counter is precisely
>> BECAUSE the passed-in memtupcount includes nulls, and therefore the
>> code will UNDERESTIMATE the fraction of successfully abbreviated
>> values if the comparison is based on memtupcount.

Peter> Oh, right. It's the other way around in your original.

Peter> I don't really buy it, either way. In what sense is a NULL value
Peter> ever abbreviated? It isn't. Whatever about the cost model,
Peter> that's the truth of the matter. There is always going to be a
Peter> sort of tension in any cost model, between whether or not it's
Peter> worth making it more sophisticated, and the extent to which
Peter> tweaking the model is chasing diminishing returns.

Comparisons between nulls and nulls, or between nulls and non-nulls, are
cheap; only comparisons between non-nulls and non-nulls can be
expensive.

The purpose of abbreviation is to replace expensive comparisons by cheap
ones where possible, and therefore the cost model used for abbreviation
should ignore nulls entirely; all that matters is the number of non-null
values and the probability of saving time by abbreviating them.

So if you're sorting a million rows of which 900,000 are null and
100,000 contain 50 different non-null values, then the absolute time
saved (not the proportion) by doing abbreviation should be on the same
order as the absolute time saved by abbreviation when sorting just the
100,000 non-null rows.

But if the cost model does 1,000,000/50 and gets 20,000, and decides
"that's worse than my 1 in 10,000 target, I'll abort abbreviations",
then you have sacrificed the time gain for no reason at all. This is
what I mean by "spurious". This is why the cost model must compute the
fraction as 100,000/50, ignoring the null inputs, if it's going to
perform anything like optimally in the presence of nulls.

Peter> By what objective standard is that spurious?

The objective standard of my wallclock.

Doing a simple count of values abbreviated is not anything that could be
considered "complex", and there is nothing at all "ad-hoc" about it.
Your method is simply incorrect on both logical and performance
grounds.

Peter> Your example has one abbreviated key in it, which is exactly
Peter> worthless among 9999 NULL values.

My example wasn't intended to imply that there would be only one
non-null value in the whole sort, merely to illustrate why nulls need to
be ignored so as not to incorrectly bias the model.

You do not know whether abbreviation will be useful until you have seen
a sufficient number of NON-NULL values. The problem of the initial
subset of data not necessarily being representative is not relevant to
this.

Peter> I also think that your explanation of the encoding schemes was
Peter> perfunctory.

I'm interested in other opinions on that, because I find your
replacement for it both confusingly organized and a bit misleading (for
example saying the top bit is "wasted" is wrong, it's reserved because
we need it free for the sign).

(It is true that mine assumes that the reader knows what "excess-N"
means, or can find out.)

Here's mine, which is given as a single block comment:

+ * Two different representations are used for the abbreviated form, one in
+ * int32 and one in int64, with the int64 one used if USE_FLOAT8_BYVAL is set
+ * (which despite the name is also the flag that says whether int64 is
+ * passed-by-value). In both cases the representation is negated relative to
+ * the original value, because we use the largest negative value for NaN, which
+ * sorts higher than other values. We convert the absolute value of the numeric
+ * to a 31-bit or 63-bit positive value, and then negate it if the original
+ * number was positive.
+ *
+ * The 31-bit value is constructed as:
+ *
+ * 0 + 7bits digit weight + 24 bits digit value
+ *
+ * where the digit weight is in single decimal digits, not digit words, and
+ * stored in excess-44 representation. The 24-bit digit value is the 7 most
+ * significant decimal digits of the value converted to binary. Values whose
+ * weights would fall outside the representable range are rounded off to zero
+ * (which is also used to represent actual zeros) or to 0x7FFFFFFF (which
+ * otherwise cannot occur). Abbreviation therefore fails to gain any advantage
+ * where values are outside the range 10^-44 to 10^83, which is not considered
+ * to be a serious limitation, or when values are of the same magnitude and
+ * equal in the first 7 decimal digits, which is considered to be an
+ * unavoidable limitation given the available bits. (Stealing three more bits
+ * to compare another digit would narrow the range of representable weights by
+ * a factor of 8, which starts to look like a real limiting factor.)
+ *
+ * (The value 44 for the excess is essentially arbitrary)
+ *
+ * The 63-bit value is constructed as:
+ *
+ * 0 + 7bits weight + 4 x 14-bit packed digit words
+ *
+ * The weight in this case is again stored in excess-44, but this time is it
+ * the original weight in digit words (i.e. powers of 10000). The first four
+ * digit words of the value (if present; trailing zeros are assumed as needed)
+ * are packed into 14 bits each to form the rest of the value. Again,
+ * out-of-range values are rounded off to 0 or 0x7FFFFFFFFFFFFFFF. The
+ * representable range in this case is 10^-176 to 10^332, which is considered
+ * to be good enough for all practical purposes, and comparison of 4 words
+ * means that at least 13 decimal digits are compared, which is considered to
+ * be a reasonable compromise between effectiveness and efficiency in computing
+ * the abbreviation.
+ *
+ * (The value 44 for the excess is even more arbitrary here, it was chosen just
+ * to match the value used in the 31-bit case)

Peter's, inline with the code (omitted here):

+ * First, store "weight" in first 7 binary digits (7 binary digits is
+ * the smallest number of digits that allows us to represent weights
+ * -44 to 83 inclusive).
+ *
+ * The weight is stored in excess-44 -- the original weight in digit
+ * words (i.e. powers of NBASE/10000). The "excess-K"/offset binary
+ * technique is also used with IEEE-754 floating point numbers. As
+ * with IEEE-754, we use an exponent without a sign (a 7-bit exponent
+ * without a sign). And as with IEEE-754, the lack of a sign does not
+ * imply that the exponent must be *logically* positive.
+ *
+ * Adding 44 to "weight" makes the smallest possible "exponent" 7
+ * binary digit number (where "res" hasn't already saturated to
+ * NUMERIC_ABBREV_NEG_INFINITY) have the value zero in this
+ * representation (since the smallest possible "weight" is -44). If
+ * the "weight" is 83, the largest possible, then "exponent" will have
+ * 0b1111111 as its first 7 binary digits. The 7-digit exponent is
+ * always positive (forgetting for a moment that it's really
+ * excess-44), and thus the bits could be equivalently interpreted as
+ * either signed or unsigned.
+ *
+ * We left shift 56 bits in order to have this 7-bit representation
+ * stored at the beginning of "exponent", a two's complement/signed
+ * 64-bit integer. In doing so, an eighth bit is "wasted".
[...]
+ * Append 4 x 14-bit packed digit words into remaining 56 bits.
+ * 14-bits of storage should be enough to represent the largest
+ * possible base-NBASE digit, despite the fact that those are stored as
+ * int16.
[...]
+ * Flip the abbreviated int64 representation's sign, for positive numerics
+ * only, since the 63-bit value is currently the absolute value. With
+ * this, the last detail of encoding things such that int64 comparisons can
+ * be interpreted backwards (within the abbreviated comparator, as a proxy
+ * for actual numeric comparisons) is complete.

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Kirkwood 2015-03-21 07:00:37 Re: Remove fsync ON/OFF as a visible option?
Previous Message Jaime Casanova 2015-03-21 06:28:22 Re: Remove fsync ON/OFF as a visible option?