Skip site navigation (1) Skip section navigation (2)

Re: How to pass around collation information

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to pass around collation information
Date: 2010-05-28 17:59:45
Message-ID: 1275069585.12068.17.camel@vanquo.pezone.net (view raw or flat)
Thread:
Lists: pgsql-hackers
On fre, 2010-05-28 at 20:22 +0300, Heikki Linnakangas wrote:
> It's also fundamentally wrong, collation is not a property of a datum 
> but of the operation.

> One way to approach this is to realize that it's already possible to
> use 
> multiple collations in a database. You just have to define separate <
> = 
>  > operators and operator classes for every collation, and change all 
> your queries to use the right operator depending on the desired 
> collation everywhere where you use < = > (including ORDER BYs, with
> the 
> USING <operator> syntax). The behavior is exactly what we want, it's 
> just completely inpractical, so we need something to do the same in a 
> less cumbersome way.

Well, maybe we should step back a little and work out what sort of
feature we actually want, if any.  The feature I'm thinking of is what
people might call "per-column locale", and the SQL standard defines
that.  It would look like this:

CREATE TABLE test (
    a varchar COLLATE de,
    b varchar COLLATE fr
);

SELECT * FROM test WHERE a > 'baz' ORDER BY b;

So while it's true that the collation is used by the operations (> and
ORDER BY), the information which collation to use comes with the data
values.  It's basically saying, a is in language "de", so sort it like
that unless told otherwise.  There is also an override syntax available,
like this:

SELECT * FROM test WHERE a COLLATE en > 'baz' ORDER BY b COLLATE sv;

But here again the collation is attached to a data value, and only from
there it is passed to the operator.  What is actually happening is

SELECT * FROM test WHERE (a COLLATE en) > 'baz' ORDER BY (b COLLATE sv);


What you appear to be describing is a "per-operation locale", which also
sounds valid, but it would be a different thing.  It might be thought of
as this:

SELECT * FROM test WHERE a (> COLLATE en) 'baz' ORDER BY COLLATE sv b;

with some suitable global default.


So which one of these should it be?



In response to

Responses

pgsql-hackers by date

Next:From: Bruce MomjianDate: 2010-05-28 18:04:40
Subject: Re: VPATH docs
Previous:From: Andy BalholmDate: 2010-05-28 17:50:07
Subject: Re: [BUGS] dividing money by money

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group