Skip site navigation (1) Skip section navigation (2)

Re: GiST for range types (was Re: Range Types - typo + NULL string constructor)

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: GiST for range types (was Re: Range Types - typo + NULL string constructor)
Date: 2012-01-24 12:07:15
Message-ID: CAPpHfduv5bwwXgdErZvneX4bT0hDb6MLznk63+Ht9vGKBE2n1Q@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi!

New version of patch is attached.
On Thu, Dec 22, 2011 at 11:46 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> A few comments:
>
> * In range_gist_picksplit, it would be nice to have a little bit more
> intuitive description of what's going on with the nonEmptyCount and
> nonInfCount numbers. For instance, it appears to depend on the fact that
> a range must either be in nonEmptyCount or in nonInfCount. Also, can you
> describe the reason you're multiplying by two and taking the absolute
> value? It seems to work, but I am missing the intuition behind those
> operations.
>
total_count - 2*nonEmptyCount = (total_count - nonEmptyCount) -
nonEmptyCount = emptyCount - nonEmptyCount
So, it's really not evident. I've simplified it.



> * The penalty function is fairly hard to read still. At a high level, I
> think we're trying to accomplish a few things (in order from most to
> least important):
>  - Keep normal ranges separate.
>  - Avoid broadening the class of the original predicate (e.g. turning
> single-infinite into double-infinite).
>  - Avoid broadening (as determined by subtype_diff) the original
> predicate.
>  - Favor adding ranges to narrower original predicates.
>
> Do you agree? If so, perhaps we can attack those more directly and it
> might be a little more readable.
>
> Additionally, the arbitrary numbers might become a problem. Can we
> choose better constants there? They would still be arbitrary when
> compared with real numbers derived from subtype_diff, but maybe we can
> still do better than what's there.
>
I've changes some comments and add constants for penalty values.


> * Regarding the leftover "common" entries that can go to either side:
> what is the "delta" measure trying to accomplish? When I work through
> some examples, it seems to favor putting larger common ranges on the
> left (small delta) and smaller common ranges on the right (smaller
> delta). Why is that good? Or did I misread the code?
>
> Intuitively, I would think that we'd want to push the ranges with lower
> upper bounds to the left and higher lower bounds to the right -- in
> other words, recurse. Obviously, we'd need to make sure it terminated at
> some point, but splitting the common entries does seem like a smaller
> version of the original problem. Thoughts?
>
That was a bug. Actually, no "abs" is needed. Indeed it doesn't affect
result significantly.

-----
With best regards,
Alexander Korotkov.

Attachment: rangetypegist-0.6.patch.gz
Description: application/x-gzip (10.2 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Heikki LinnakangasDate: 2012-01-24 12:38:04
Subject: Re: Client Messages
Previous:From: Simon RiggsDate: 2012-01-24 11:22:16
Subject: Re: Online base backup from the hot-standby

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group