Re: Open issues for collations

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Open issues for collations
Date: 2011-03-26 15:16:55
Message-ID: 80857DD4-EFBB-4E0E-A7A7-BE9529AF8634@gmail.com
Lists: pgsql-hackers

On Mar 26, 2011, at 12:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> ** Selecting a field from a record-returning function's output.
> Currently, we'll use the field's declared collation; except that
> if the field has default collation, we'll replace that with the common
> collation of the function's inputs, if any. Is either part of that
> sane? Do we need to make this work for functions invoked with other
> syntax than a plain function call, eg operator or cast syntax?

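I'm no expert on this topic. That said, the first part of that rule (using the field's declared collation) seems quite sane. The second part (substituting the common collation of the function's inputs when the field has the default collation) seems less clear, but probably also sane.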

> ** What to do with domains whose declaration includes a COLLATE clause?
> Currently, we'll impute that collation to the result of a cast to the
> domain type --- even if the cast's input expression includes an
> explicit COLLATE clause.

I would have thought that an explicit COLLATE clause would trump any action at a distance, such as a collation attached to the domain's declaration.

> * In plpgsql, is it OK for declared local variables to inherit the
> function's input collation? Should we provide a COLLATE option in
> variable declarations to let that be overridden? If Oracle understands
> COLLATE, probably we should look at what they do in PL/SQL.

I don't know what Oracle does, but a COLLATE option in variable declarations seems like a very good idea. Inheriting the function's input collation when none is specified seems good too. I also suspect we might need something like COLLATE FROM $1, but maybe that's a 9.2 feature.

> * RI triggers should insert COLLATE clauses in generated queries to
> satisfy SQL2008 9.13 SR 4a, which says that RI comparisons use the
> referenced column's collation. Right now you may get either table's
> collation depending on which query type is involved. I think an obvious
> failure may not be possible so long as equality means the same thing in
> all collations, but it's definitely possible that the planner might
> decide it can't use the referenced column's unique index, which would
> suck for performance. (Note: this rule seems to prove that the
> committee assumes equality can mean different things in different
> collations, else they'd not have felt the need to specify.)

No idea what to do about this.

> * It'd sure be nice if we had some nontrivial test cases that work in
> encodings besides UTF8. I'm still bothered that the committed patch
> failed to cover single-byte-encoding cases in upper/lower/initcap.

Or this.

> * Remove initdb's warning about useless locales? Seems like pointless
> noise, or at least something that can be relegated to debug mode.

+1.

> * Is it worth adding a cares-about-collation flag to pg_proc? Probably
> too late to be worrying about such refinements for 9.1.

Depends on how much knock-on work it'll create.

...Robert
