Re: WIP patch: distinguish selectivity of < from <= and > from >=

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP patch: distinguish selectivity of < from <= and > from >=
Date: 2017-07-06 21:23:12
Message-ID: 30820.1499376192@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> writes:
> On Thu, Jul 6, 2017 at 3:45 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> + * In the first bin (i==1), add a fudge factor that ensures
> + * that histfrac is at least eq_selec. We do this because we
> + * know that the first histogram entry does satisfy the
> + * inequality (if !isgt) or not satisfy it (if isgt), so our
> + * estimate here should certainly not be zero even if binfrac
> + * is zero. (XXX experimentally this is the correct way to do
> + * it, but why isn't it a linear adjustment across the whole
> + * histogram rather than just the first bin?)

> Given that the values are distinct, (I've some doubts for real number case)

Well, really, we are *always* dealing with discrete distributions here.

> if histogram_bounds are assigned as,
> {0,9,19,29,39,49,59,69,79,89,99,109,119,129,13,..........}
> ...
> So, if val=low, then hisfrac = (bucket_current - 1)/num_of_buckets
> which means it assumes val is included in the previous bucket.

I looked at that again and realized that one of the answers I was missing
lies in the behavior of analyze.c's compute_scalar_stats(). When it has
collected "nvals" values and wants to make a histogram containing
"num_hist" entries, it does this:

* The object of this loop is to copy the first and last values[]
* entries along with evenly-spaced values in between. So the
* i'th value is values[(i * (nvals - 1)) / (num_hist - 1)].

(where i runs from 0 to num_hist - 1). Because the "/" denotes integer
division, this effectively means that for all but the first entry,
we are taking the last value out of the corresponding bucket of samples.
That's obviously true for the last histogram entry, which will be the very
last sample value, and it's also true for earlier ones --- except for the
*first* histogram entry, which is necessarily the first one in its bucket.
So the first histogram bucket is actually a tad narrower than the others,
which is visible in the typical data you showed above: 0 to 9 is only 9
counts wide, but all the remaining buckets are 10 counts wide. That
explains why we need a scale adjustment just in the first bucket.

I think I'm also beginning to grasp why scalarineqsel's basic estimation
process produces an estimate for "x <= p" rather than some other condition
such as "x < p". For clarity, imagine that we're given p equal to the
last histogram entry. If the test operator is in fact "<=", then it will
say "true" for every histogram entry, and we'll fall off the end of the
histogram and return 1.0, which is correct. If the test operator is "<",
it will return "true" for all but the last entry, so that we end up with
"i" equal to sslot.nvalues - 1 --- but we will compute binfrac = 1.0,
if convert_to_scalar() produces sane answers, again resulting in
histfrac = 1.0. Similar reasoning applies for ">" and ">=", so that in
all cases we arrive at histfrac = 1.0 if p equals the last histogram
entry. More generally, if p is equal to some interior histogram entry,
say the k'th one out of n total, we end up with histfrac = (k-1)/(n-1)
no matter which operator we probe with, assuming that convert_to_scalar()
correctly gives us a binfrac of 0.0 or 1.0 depending on which of the
adjacent bins we end up examining. So, remembering that the histogram
entries occupy the right ends of their buckets, the proper interpretation
of that is that it's the probability of "x <= p". (This is wrong for the
first histogram entry, but that's why we need a correction there.)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2017-07-06 21:43:44 Re: Multi column range partition table
Previous Message Dean Rasheed 2017-07-06 20:24:07 Re: Multi column range partition table