Re: BUG #6401: IS DISTINCT FROM improperly compares geomoetric datatypes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, kenaniah <kenaniah(at)gmail(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6401: IS DISTINCT FROM improperly compares geomoetric datatypes
Date: 2012-01-19 15:39:42
Message-ID: 21582.1326987582@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Excerpts from Heikki Linnakangas's message of jue ene 19 07:25:36 -0300 2012:
>> Frankly that's such a rare corner case that I'm not very enthusiastic
>> about fixing it. One idea would be to look up the type's b-tree sort
>> operators, and pick the equality operator from there. But point datatype
>> doesn't have b-tree sort operators, either, so it wouldn't help in this
>> case.

> It doesn't have a hash opclass either, which could be used as a fallback
> in case there's no btree. Point cannot obviously have a btree opclass
> (no inequalities), but a hash one seems possible.

> I think the use case of IS NOT DISTINCT FROM for rowtypes in triggers is
> a valid one.

Note that IS [NOT] DISTINCT is not the only place that assumes that it
should use an operator named "=". There's also scalar IN, the simple
form of CASE, and possibly some others that I forget at the moment.
IMO, if we're going to change the semantics of any of these, we should
do them all together.

This is something I've kinda wanted to do for a long time, but never
gotten around to. We've managed to clean up hard-wired assumptions
about operator names in a lot of other places, but these syntactic
constructs still do it by name.

One argument against changing it is that arguably doing so would violate
the letter of the SQL standard. For example, I observe that SQL defines
the IN construct thus:

The expression

RVC IN IPV

is equivalent to

RVC = ANY IPV

(SQL99 8.4 <in predicate> syntax rule 4). The word "equality" appears
nowhere in the definition of IN. Thus, if we take "X IN (Y,Z,Q)" and
implement it with some operator not named "=", we have not done what
the spec clearly says to do. Now you can make the case that we'd be
implementing the spirit rather than the letter of the spec, but that's
a rather shaky case to have to make.

The same is true for simple CASE:

c) The <case specification> is equivalent to a <searched case>
in which each <searched when clause> specifies a <search
condition> of the form "CO=WO".

with absolutely no hint that equality is what the "=" symbol is supposed
to get you. And in 8.13 <distinct predicate> we have

Case:
i) "X IS DISTINCT FROM Y" is false if either:
1) X and Y are the null value, or
2) X = Y according to Subclause 8.2, "<comparison
predicate>".
ii) Otherwise, "X IS DISTINCT FROM Y" is true.

which at least suggests that what's wanted is equality, but they're
still defining it in terms of an operator named "=" (and AFAICS
subclause 8.2 doesn't address the possibility that "X=Y" could mean
something other than the common idea of equality).

So on the whole, it might be better to just provide an operator named
"=" for point, and not open up the can of worms about whether these
constructs should use some other rule for deciding which operator to
compare with.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Robert Haas 2012-01-20 02:46:45 Re: BUG #6400: function arguments not accepted
Previous Message Andrea Grassi 2012-01-19 15:20:10 R: R: R: R: R: R: BUG #6342: libpq blocks forever in "poll" function