Re: How to pass around collation information

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>,pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to pass around collation information
Date: 2010-05-28 21:23:14
Message-ID: 20100528212313.GA18306@svana.org
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, May 28, 2010 at 10:32:34PM +0300, Peter Eisentraut wrote:
> On fre, 2010-05-28 at 14:48 -0400, Tom Lane wrote:
> > > SELECT * FROM test WHERE a COLLATE en > 'baz' ORDER BY b COLLATE sv;
> >
> > That seems fairly bizarre. What does this mean:
> >
> > WHERE a COLLATE en > b COLLATE de
> >
> > ? If it's an error, why is this not an error
> >
> > WHERE a COLLATE en > b
> >
> > if b is marked as COLLATE de in its table?
>
> The way I understand it, a collation "derivation" can be explicit or
> implicit. Explicit derivations override implicit derivations. If in
> the argument set of an operation, explicit collation derivations exist,
> they must all be the same.

The SQL standard has an explicit set of rules for determining the
collations of any particular operation (they apply to
operators/functions not to the datums).

The basic idea is that tables/columns/data types define an implicit
collation, which can be overidden by explicit collations. If there is
ambiguity you throw an error. I implemented this all several years ago,
it's not all that complicated really. IIRC I added a field to the Node type
and each level determined it's collection from the sublevels.

I solved the problem for the OP by providing an extra function to user
defined functions which would return the collation for that particular
call.

The more interesting question I found was that the standard only
defined collation for strings, whereas it can be applied much more
broadly. I described a possible solution several years back, it should
in the archives somewhere. It worked pretty well as I recall.

IIRC The idea was to let each type has its own set of collations and
when using an operator/function you let the collection be determined
by the argument that had the same type as the return type.

It would be nice if COLLATE could finally be implemented, it'd be quite
useful.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan UrbaƄski 2010-05-28 21:47:28 Re: tsvector pg_stats seems quite a bit off.
Previous Message Jaime Casanova 2010-05-28 21:22:37 Re: List traffic