Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Date: 2011-03-10 22:51:31
Message-ID: 25342.1299797491@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Martijn van Oosterhout <kleptog(at)svana(dot)org> writes:
> On Thu, Mar 10, 2011 at 05:16:52PM -0500, Tom Lane wrote:
>> On the other hand ... one thing that's been bothering me is that
>> select_common_collation assumes that "explicit" collation derivation
>> doesn't bubble up in the tree, ie a COLLATE is only a forcing function
>> for the immediate parent expression node. It's not at all clear to me
>> that that's a correct reading of the spec.

> I beleive the current interpretation, that is the COLLATE only applies
> to levels above, is the correct interpretation. COLLATE binds tightly,
> so

> A op B COLLATE C parses as A op (B COLLATE C)

> which is why it works.

No, that's not what I'm on about. Consider

(((A COLLATE X) || B) || (C COLLATE Y)) < (D COLLATE Z)

(I've spelled out the parenthesization in full for clarity, but most
of these parens could be omitted.) Is this expression legal, or
should the "<" operator be throwing an error for conflicting
explicitly-derived collations? Our code as it stands will take it,
because no individual operator sees more than one COLLATE among its
arguments. But I'm not sure this is right. The only text I can find
in SQL2008 that seems to bear on the point is in 4.2.2:

Anything that has a declared type can, if that type is a
character string type, be associated with a collation applicable
to its character set; this is known as a declared type
collation. Every declared type that is a character string type
has a collation derivation, this being either none, implicit, or
explicit. The collation derivation of a declared type with a
declared type collation that is explicitly or implicitly
specified by a <data type> is implicit. If the collation
derivation of a declared type that has a declared type collation
is not implicit, then it is explicit. The collation derivation
of an expression of character string type that has no declared
type collation is none.

As I read this, the collation attached to any Var clause is implicit
(because it came from the Var's data type), and the collation attached
to a CollateClause is presumably explicit, but where does it say what
happens at higher levels in the expression tree? It's at least arguable
that the result collation of an expression is explicit if its input
collation was explicit. The fact that the default in case of doubt
apparently is supposed to be "explicit" doesn't give any aid or comfort
to your position either. If explicitness comes only from the immediate
use of COLLATE, why don't they say that? This is worded to make one
think that most cases will have explicit derivation, not only COLLATE.

I wonder if anyone can check the behavior of nested collate clauses in
DB2 or some other probably-spec-conforming database.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-03-10 23:12:00 Can't specify default collation?
Previous Message Martijn van Oosterhout 2011-03-10 22:31:43 Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty