Re: Range Types, constructors, and the type system

From: Florian Pflug <fgp(at)phlo(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Range Types, constructors, and the type system
Date: 2011-06-29 11:35:48
Message-ID: 613D83B8-CB7F-450C-A97B-BA6E3DB50374@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Jun29, 2011, at 05:02 , Jeff Davis wrote:
> On Tue, 2011-06-28 at 22:20 +0200, Florian Pflug wrote:
>> I believe if we go that route we should make RANGEINPUT a full-blown
>> type, having "pair of bound" semantics. Adding a lobotomized version
>> just for the sake of range input feels a bit like a kludge to me.
>
> It's not a pair, because it can be made up of 0, 1, or 2 scalar values
> (unless you count infinity as one of those values, in which case 0 or
> 2). And without ordering, it's not clear that those values are really
> "bounds".

Hm, yeah, the lack of an ordering operator is trouble. There also seem
to be more problems with that idea, see below for that. So scratch
the idea of turning RANGEINPUT into a full-blown type...

> I don't think that having an extra type around is so bad. It solves a
> lot of problems, and doesn't seem like it would get in the way. And it's
> only for the construction of ranges out of scalars, which seems like the
> most natural place where a cast might be required (similar to casting an
> unknown literal, which is fairly common).

What I'm concerned about is how elegantly we'd be able to tie up all
the loose ends. What'd be the result of
select range(1,2)
for example? Or
create table (r rangeinput)
for that matter.

I think we'd want to forbid both of these, and more or less every other
use except
range(1,2)::<some range type>
but that seems to require special-casing RANGEINPUT in a lot of places.

If we don't restrict RANGEINPUT that way, I think we ought to provide
at least a basic set of operators and functions for it - e.g.
input, output, lower(), upper(), ...

*Pondering this*

But we can't do that easily, since RANGEINPUT would actually be a kind of
VARIANT type (i.e. can hold values of arbitrary types). That's something
that our type system doesn't really support. We do have RECORD, which is
similar in a way, but its implementation is about as intrusive as it
gets...

>> Alternatively, we could replace RANGEINPUT simply with ANYARRAY,
>> and add functions ANYRANGE->ANYRANGE which allow specifying the
>> bound operator (<, <= respectively >,>=) after construction.
>>
>> So you'd write (using the functions-as-fields syntax I believe
>> we support)
>> (ARRAY[1,2]::int8range).left_open.right_closed for '(1,2]'
>> and
>> ARRAY[NULL,2]::int8range for '[-inf,2]'
>
> I think we can rule this one out:
> * The functions-as-fields syntax is all but deprecated (or should be)

Is it? That's actually too bad, since I kinda like it. But anyway,
if that's a concern it could also be
range_bounds(ARRAY[1,2]::int8range, '(]')

> * That's hardly a readability improvement

Granted, it won't win any beauty contest, but

> * It still suffers similar problems as casting back and forth to text:
> ANYARRAY is too general, doesn't really take advantage of the type
> system, and not a great fit anyway.

I believe it alleviates the gravest problems of casting back and forth
to text. It doesn't have quoting issues and it doesn't potentially lose
information.

In any case, I wouldn't expect this to *stay* the only way to construct
a range forever. But I does have it's virtues for a first incarnation of
range type, I believe, mostly because it's completely unintrusive and
won't cause any backwards-compatbility headaches in the future

>> All assuming that modifying the type system to support polymorphic
>> type resolution based on the return type is out of the question... ;-)
>
> It's still not out of the question, but I thought that the intermediate
> type would be a less-intrusive alternative (and Robert seemed concerned
> about how intrusive it was).

I fear that the intermediate type will turn out to be quite intrusive,
at least if we try to handle all the corner cases and loose ends. And if
we don't, I'm concerned that we're painting ourselves into a corner here...

> There also might be a little more effort educating users if we selected
> the function based on the return type, because they might think that
> casting the inputs explicitly would be enough to get it to pick the
> right function. If it were a new syntax like RANGE[]::int8range, then I
> think it would be easier to understand.

There's certainly a risk of confusion here, simply because the relationship
between ANYRANGE and ANYLEMENT will be quite different than that of
ANYARRAY and ANYLEMENT. All we can do is state this very clearly in the
docs I think, and explain that it must be that way to support multiple
range types over the same base type.

best regards,
Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Albe Laurenz 2011-06-29 12:23:45 Bug in SQL/MED?
Previous Message Simon Riggs 2011-06-29 08:00:40 Re: time-delayed standbys