Re: BUG #2317: Wrong sorting order for (VW)

From: Tomas Klockar <tomask(at)omicron(dot)se>
To: Magnus Hagander <mha(at)sollentuna(dot)net>
Cc: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #2317: Wrong sorting order for (VW)
Date: 2006-03-16 10:03:59
Message-ID: 4419380F.8000208@omicron.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Magnus Hagander wrote:

>>>The following bug has been logged online:
>>>
>>>Bug reference: 2317
>>>Logged by: TomasKlockar
>>>Email address: tomask(at)omicron(dot)se
>>>PostgreSQL version: 7.3.2/7.4.7
>>>Operating system: linux(fedora)
>>>Description: Wrong sorting order for (VW)
>>>Details:
>>>
>>>SELECT cname FROM clients ORDER BY cname;
>>>
>>>sorts W before V and that is a mistake.
>>>
>>>the result is
>>>
>>>V
>>>W
>>>WHI
>>>Vi
>>>Wi
>>>Volvo
>>>
>>>Correct order would be to place all vV before all wW.
>>>
>>>I think the database was initialized with UTF8 but it might
>>>
>>>
>>have been
>>
>>
>>>ISO_8859-1, however the error is easily repeatable and ill
>>>
>>>
>>be happy to
>>
>>
>>>provide a dump.
>>>
>>>I tested this on 7.3.2 and 7.4.7 and the error was repeatable.
>>>
>>>
>>I think the most important thing we'd need to know is what
>>locale the database was initialized with in order to try to reproduce.
>>
>>If you put similar data in a file and use the unix "sort"
>>command with the same locale, do you get the same order?
>>
>>
>
>That looks like the typical swedish locale, which sorts V and W as the
>same character. Yes, that can be very annoying for some of us :-), but
>that's the way it's defined.
>
>//Magnus
>
>
I would say that sorting V and W as the same character may work in a
dictionary or sometimes when sorting names, however in that case why
don't it sort C and K as the same character? and in some cases C and S
as the same.

The best sorting algorithm should sort all characters as separate
characters, and if you want fancy sorting you should need to turn it on
for a table.

Fancy sorting, like sorting Carlsson and Karlsson together since they
are equal in the same way as Viktor=Wiktor=Victor=Wictor. Cesar and
Sesar would also have to be sorted together since they are pronounced
the same in swedish, and if you look in a phonebook you would find them
at the same place.

Now I get the english words was and vas sorted together when I need them
separated.

Currently the sortingfunction in postgreSQL have disqualified itself
from beeing used, and I do the sorting in java which treat them as
separate caracters.

This is my 5 cent,

/Tomas

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Magnus Hagander 2006-03-16 10:08:24 Re: BUG #2317: Wrong sorting order for (VW)
Previous Message Magnus Hagander 2006-03-16 09:10:13 Re: BUG #2317: Wrong sorting order for (VW)