
Re: locale-specific sort algorithms undocumented?

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,John Gunther <mail(at)bucksvsbytes(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: locale-specific sort algorithms undocumented?
Date: 2004-07-26 08:49:12
Message-ID: 200407261049.12346.peter_e@gmx.net
Lists: pgsql-general
Tom Lane wrote:
> > I now find that sorting is very different with that setting: It
> > appears, through trial and error, that all non-alphanumeric
> > characters are completely ignored by ORDER BY.
>
> I doubt they are ignored completely, but they probably are ignored in
> the first-order comparison.

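The way this more or less works is that strings are compared in several 
passes, and a later pass is only consulted if all earlier passes found 
the strings equal: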

First pass: letters, numbers
Second pass: accents
Third pass: upper/lower case
Fourth pass: punctuation characters
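
To make the pass ordering concrete, here is a toy sketch in Python. It is 
not the ISO/IEC 14651 or Unicode Collation Algorithm tables, and it is not 
what any C library actually does; the function name multilevel_key and the 
way each level is encoded are assumptions made up for illustration. It 
only shows how a four-level key makes punctuation matter last.

```python
import unicodedata

def multilevel_key(s):
    """Toy four-level sort key (hypothetical helper, not a real collation table)."""
    base = []     # pass 1: letters and digits, case-folded, accents stripped
    accents = []  # pass 2: combining marks attached to each base character
    case = []     # pass 3: 0 for lower case (or caseless), 1 for upper case
    punct = []    # pass 4: everything else, with the position where it occurred
    # NFD splits an accented letter into base letter + combining mark(s).
    for ch in unicodedata.normalize("NFD", s):
        if unicodedata.combining(ch):
            if accents:
                accents[-1] += ch
        elif ch.isalnum():
            base.append(ch.lower())
            accents.append("")
            case.append(1 if ch.isupper() else 0)
        else:
            punct.append((len(base), ch))
    # Tuples compare element by element, so a later level is consulted
    # only when all earlier levels are equal.
    return (tuple(base), tuple(accents), tuple(case), tuple(punct))

words = ["co-op", "coop", "Coop", "co\u00f6p", "cop", "co op"]
result = sorted(words, key=multilevel_key)
print(result)
```

With this key, "co-op" and "co op" sort next to "coop" rather than 
before all letters, which matches the behaviour described in the 
original report.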

This is all enshrined in various standards such as ISO/IEC 14651, 
national standards based on it, and independent technical standards 
such as the Unicode Collation Algorithm.

The latter in fact allows what many people appear to be looking for: a 
"variable weighting" option that allows you to promote punctuation 
characters to the first pass.  But I don't think any operating system 
implements that yet.
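
As a rough illustration of what promoting punctuation to the first pass 
would change, here is a simplified two-level sketch in Python. The 
function sort_key and its promote_punctuation flag are hypothetical names 
invented for this example; real implementations take their weights from 
collation tables rather than raw code points.

```python
def sort_key(s, promote_punctuation=False):
    """Toy key: letters/digits first, punctuation last unless promoted."""
    primary = []  # first pass
    punct = []    # last pass, used only when punctuation is not promoted
    for ch in s:
        if ch.isalnum():
            primary.append(ch.lower())
        elif promote_punctuation:
            # Punctuation takes part in the first pass like any other character.
            primary.append(ch)
        else:
            # Punctuation is ignored in the first pass and only breaks ties.
            punct.append((len(primary), ch))
    return (tuple(primary), tuple(punct))

words = ["a-c", "ab", "a c", "ad"]
ignored_first = sorted(words, key=sort_key)
promoted = sorted(words, key=lambda w: sort_key(w, promote_punctuation=True))
print(ignored_first)
print(promoted)
```

In the first result the space and hyphen only break the tie between 
"a c" and "a-c"; in the second they decide the order in the first pass, 
so both strings sort ahead of "ab".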

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

