Re: Open issues for collations

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Open issues for collations
Date: 2011-03-27 21:38:59
Message-ID: 20110327213859.GB30031@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 27, 2011 at 03:14:37PM -0400, Tom Lane wrote:
> > So the result of a cast would be the collation of the specified
> > type/domain with state implicit.
>
> Hm. That makes sense for explicit CAST syntax, but what about a
> function returning a collatable type? In particular, applying this
> rule to the || operator would have us conclude that
>
> x || y COLLATE "foo"
>
> doesn't have an overall collation of "foo", which seems clearly wrong.
> But if you claim that it's all based on the result type of the function
> then it's hard to avoid that outcome.

I don't see this. There's no explicit cast in the above expression. The
return type says something, but that gets back to the question of
whether the return type indicates a cast to that type or not. In any
case, the SQL specifically states what should happen in the above case.

The question started with what happens when the function is declared to
return a domain type. I'm really not sure what to do there. I'd suggest
similar to how the typmod is propegated but I'm afraid that the answer
there is "we don't". What I suggested with combining it with the
collations of the input types is about the only concrete suggestion I
have.

> > Also, apparently the COLLATE clause as allowed anywhere where a
> > datatype is permitted. So you can say:
>
> > CAST( foo AS TEXT COLLATE "en_US" )
>
> No, you can't; this case is specifically disallowed by SQL:2008 6.12

Maybe they just thought it confusing that COLLATE would mean different
things in different contexts and forbade it.

> > (1) explicit states in the arguments will override it
> > (2) if arguments are implicit state and conflict with domain, the
> > result is no-collation, otherwise implicitly whatever the domain was
> > (3) no arguments have collation, which means you get the domain
> > default.
>
> > Which all seems eminently reasonable.
>
> It's reasonable by itself, but it's also rather different from what you
> just argued the behavior should be for field selection. I don't see
> the rationale for treating those cases differently. In fact, given that
> Postgres has always treated f(x) and x.f as equivalent notations,
> I think there's a pretty strong argument for wanting their effects on
> collation choice to be equivalent too.

Well, that's a good argument. But I'd suggest that here x.f is not a
column reference, so there's no reason it should be treated like one.
The SQL standard says column references determine collations but for
various strings operations it's the inputs that count. This makes sense
if you consider the return type declaration to be merely an assertion
and not a cast.

<snip about default collation on declared variables>

> No, I don't care for that, because then it matters whether an expression
> contains any direct references to the input arguments versus containing
> just local variables. In particular, assigning a parameter to a local
> variable and then using the local variable instead of the parameter
> could change the function's results. That seems mighty surprising and
> bug-prone to me. So I think it's important that local variables default
> to the function's input collation. I have no problem with adding
> explicit COLLATE to the declaration syntax for plpgsql variables,
> though.

Ok, you've convinced me here. You effectively set the default collation
for the period of the function, which I think is perfectly defensible.
We might allow people in the future to override it (a la search_path)
but this is a good start.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2011-03-27 21:45:03 Additional options for Sync Replication
Previous Message Dimitri Fontaine 2011-03-27 21:13:32 Re: DO hint update?