Re: Range Types - typo + NULL string constructor

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Range Types - typo + NULL string constructor
Date: 2011-10-24 10:15:03
Message-ID: 4EA53AA7.7060205@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 17.10.2011 01:09, Jeff Davis wrote:
> On Sat, 2011-10-15 at 01:46 +0300, Heikki Linnakangas wrote:
>> * The binary i/o format includes the length of the lower and upper
>> bounds twice, once explicitly in range_send, and second time within the
>> send-function of the subtype. Seems wasteful.
>
> Any ideas how to fix that? How else do I know how much the underlying
> send function will consume?

Oh, never mind. I was misreading the code, it's not sending the length
twice.

>> * range_constructor_internal - I think it would be better to move logic
>> to figure out the the arguments into the callers.
>
> Done.

The comment above range_constructor0() is now outdated.

>> * The gist support functions frequently call range_deserialize(), which
>> does catalog lookups. Isn't that horrendously expensive?
>
> Yes, it was. I have introduced a cached structure that avoids syscache
> lookups when it's the same range as the last lookup (the common case).

Hmm, I don't think that's safe. After Oid wraparound, a range type oid
might get reused for some other range type, and the cache would return
stale values. Extremely unlikely to happen by accident, but could be
exploited by an attacker.

>> * What exactly is canonical function supposed to return? It's not clear
>> what format one should choose as the canonical format when writing a
>> custom range type. And even for the built-in types, it would be good to
>> explain which types use which canonical format (I saw the discussions on
>> that, so I understand that might still be subject to change).
>
> The canonical function is just supposed to return a new range such that
> two equal values will have equal representations. I have listed the
> built-in discrete range types and their canonical form.
>
> As far as describing what a custom range type is supposed to use for the
> canonical form, I don't know which part is exactly unclear. There aren't
> too many rules to defining one -- the only guideline is that ranges of
> equal value going in should be put in one canonical representation.

Ok. The name "canonical" certainly hints at that, but it would be good
to explicitly state that guideline. As the text stands, it would seem
that a canonical function that maps "[1,7]" to "[1,8)", and also vice
versa, "[1,8)" to "[1,7]", would be valid. That would be pretty silly,
but it would be good to say something like "The canonical output for two
values that are equal, like [1,7] and [1,8), must be equal. It doesn't
matter which representation you choose to be the canonical one, as long
as two equal values with different formattings are always mapped to the
same value with same formatting"

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2011-10-24 10:40:53 Re: Separating bgwriter and checkpointer
Previous Message Pavel Stehule 2011-10-24 10:13:26 Re: [9.1] unusable for large views