Re: GiST for range types (was Re: Range Types - typo + NULL string constructor)

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: GiST for range types (was Re: Range Types - typo + NULL string constructor)
Date: 2011-10-08 14:43:55
Message-ID: CAPpHfdsQrW74FpMe9ndb0PDddXGZNjvZvKJ-=hY0WMc0_Fh5tQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Oct 8, 2011 at 1:01 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> On Fri, 2011-10-07 at 12:54 +0400, Alexander Korotkov wrote:
>
> > The first thing caught my eye in existing GiST code is idea of
> > subtype_float. float8 has limited precision and can't respresent, for
> > example, varlena values good enough. Even if we have large int8 value
> > we can loose lower bits, but data distribution can be so that these
> > bits are valuable. Wouldn't it better to have function like
> > subtype_diff_float which returns difference between two values of
> > subtype as an float? Using of such function could make penalty more
> > sensible to even small difference between values, and accordingly more
> > relevant.
>
> The reason I did it that way is for unbounded ranges. With
> subtype_diff_float, it's difficult for the GiST code to differentiate
> between [10,) and [100000,), because infinity minus anything is
> infinity. But when inserting the range [100,200), the penalty for the
> first one should be zero and the second one should have some positive
> penalty, right?
>
I meant that penalty can be determined as sum of difference of old and new
bounds of range, i.e. penalty = subtype_diff_float(new_lower, old_lower)
+ subtype_diff_float(old_upper, new_upper).
When we insert [100,200) into [10,+inf), union([100,200), [10,+inf))
= [10,+inf), so penalty = subtype_diff_float(10,10)
+ subtype_diff_float(+inf, +inf) = 0 + 0 = 0.
When we insert [100,200) into [100000,), union([100,200), [100000,+inf))
= [100,+inf), so penalty = subtype_diff_float(100,100000)
+ subtype_diff_float(+inf, +inf) = 99900 + 0 = 99900.

But, there are still the problem, when we'are inserting open interval when
there is no such open intervals yet. For example, we're going to insert
[0,+inf), while root page contains [0,10), [10,20), [20,30). Each penalty
will be infinity, while it seems to be better to insert it into [0,10). But,
it seems to me to be general limitation of current GiST interface, when we
have to express penalty in a single float.

------
With best regards,
Alexander Korotkov.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-10-08 15:04:56 Intermittent regression test failure from index-only scans patch
Previous Message Kevin Grittner 2011-10-08 14:40:28 Re: patch : Allow toast tables to be moved to a different tablespace