Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Date: 2011-03-11 21:20:16
Message-ID: 20110311212015.GA27117@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 10, 2011 at 05:51:31PM -0500, Tom Lane wrote:
> No, that's not what I'm on about. Consider
>
> (((A COLLATE X) || B) || (C COLLATE Y)) < (D COLLATE Z)
>
> (I've spelled out the parenthesization in full for clarity, but most
> of these parens could be omitted.) Is this expression legal, or
> should the "<" operator be throwing an error for conflicting
> explicitly-derived collations? Our code as it stands will take it,
> because no individual operator sees more than one COLLATE among its
> arguments. But I'm not sure this is right. The only text I can find
> in SQL2008 that seems to bear on the point is in 4.2.2:

The rules are essentially as described here:

http://msdn.microsoft.com/en-us/library/ms179886.aspx

So:

(A COLLATE X) => collation X
((A COLLATE X) || B) => collation X
(((A COLLATE X) || B) || (C COLLATE Y)) => error

If we aren't erroring on this then we're doing it wrong. The whole
point of going through the parse tree and assigning a collation to each
node is to catch these things.

> As I read this, the collation attached to any Var clause is implicit
> (because it came from the Var's data type), and the collation attached
> to a CollateClause is presumably explicit, but where does it say what
> happens at higher levels in the expression tree? It's at least arguable
> that the result collation of an expression is explicit if its input
> collation was explicit. The fact that the default in case of doubt
> apparently is supposed to be "explicit" doesn't give any aid or comfort
> to your position either. If explicitness comes only from the immediate
> use of COLLATE, why don't they say that? This is worded to make one
> think that most cases will have explicit derivation, not only COLLATE.

See 9.3 "Data types of results of aggregations" clause (ii). It
contains essentially the rules outlined by the Transact-SQL page above.

The collation derivation and declared type collation of the result are
determined as follows.
Case:
1) If some data type in DTS has an explicit collation derivation and
declared type collation
EC1, then every data type in DTS that has an explicit collation
derivation shall have a declared
type collation that is EC1. The collation derivation is explicit and
the collation is EC1.
2) If every data type in DTS has an implicit collation derivation, then
Case:
A) If every data type in DTS has the same declared type collation IC1,
then the collation
derivation is implicit and the declared type collation is IC1.
B) Otherwise, the collation derivation is none.
3) Otherwise, the collation derivation is none.

In my implementation I needed to expand this to the general set of
operators postgresql supported and relaxed this to only consider
arguments to the function/operator that had the same type as the
resulting type of the function/operator, since that's the only thing
that makes sense.

A concatination then requires its arguments to be compatible. A substr
has the collation of its sole string argument.

I hope this helps,

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-03-11 21:56:31 Re: FuncExpr.collid/OpExpr.collid unworkably serving double duty
Previous Message Kevin Grittner 2011-03-11 20:10:05 Re: Re: [COMMITTERS] pgsql: Add missing keywords to gram.y's unreserved_keywords list.