Re: Windows UTF-8, non-ICU collation trouble

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows UTF-8, non-ICU collation trouble
Date: 2019-12-11 00:54:47
Message-ID: CA+hUKGLSbKhDTRa1JrgJM5NpqdKzJ5o4SLaKQXwTdG95BtP_TA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 10, 2019 at 10:29 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Tue, Dec 10, 2019 at 03:41:15PM +1300, Thomas Munro wrote:
> > I ran a variation of your program on Appveyor's Studio/Server 2019
> > image, and the result was the same: it thinks that cmp(s1, s2) == 0,
> > cmp(s2, s3) == 0, but cmp(s1, s3) == 1, so the operator fails to be
> > transitive.
>
> If that test is captured in self-contained artifacts (a few config files, a
> public git repository, etc.), could you share them? If not, no need to
> assemble such artifacts. I probably won't use them, but I'd be curious to
> browse them if you've already assembled them.

https://ci.appveyor.com/project/macdice/locale-sort
https://github.com/macdice/locale-sort

To understand which operating systems the images mentioned in
appveyor.yml correspond to:

https://www.appveyor.com/docs/windows-images-software/
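
For anyone reading along without opening the repository, the property
that fails can be sketched roughly as follows. This is an illustrative
Python sketch, not the program in the repository above; the names
`sign` and `find_transitivity_violations` are made up for this example,
and it assumes the comparator behaves like strcoll() (negative, zero,
or positive result).

```python
import itertools
import locale


def sign(n):
    """Collapse a comparator result to -1, 0, or 1."""
    return (n > 0) - (n < 0)


def find_transitivity_violations(strings, cmp=locale.strcoll):
    """Return (a, b, c) triples for which cmp is not transitive.

    For every ordered triple with a <= b and b <= c according to cmp,
    a consistent comparator must report a == c when both steps were
    equalities, and a < c otherwise.
    """
    violations = []
    for a, b, c in itertools.permutations(strings, 3):
        ab, bc, ac = sign(cmp(a, b)), sign(cmp(b, c)), sign(cmp(a, c))
        if ab <= 0 and bc <= 0:
            expected = 0 if (ab == 0 and bc == 0) else -1
            if ac != expected:
                violations.append((a, b, c))
    return violations
```

To try it against a real locale, call locale.setlocale(locale.LC_COLLATE, ...)
with the locale under test first, then pass the candidate strings; the
default comparator is locale.strcoll. A result like cmp(s1, s2) == 0,
cmp(s2, s3) == 0, cmp(s1, s3) == 1 would show up as the triple
(s1, s2, s3) in the returned list.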

> This does suggest some set of CompareString* parameters is free from the
> problem. If that's right, we could offer collations based on that. (I'm not
> sure it would be worth offering; ICU may be enough.)

It would be nice to get to the bottom of that (for example, what is
the relationship between names like "Korean_XXX" and names like
"ko-KR"?), but I'm unlikely to investigate further (I have enough
trouble getting N kinds of Unix to do what I want). Generally I like
the idea of continuing to support and recommend both operating system
and ICU locales for different use cases. With operating system locales,
it should be easy to get all the software on your system to agree on
ordering, which seems like a thing you should want as an application
designer. The lack of collation versioning is not a problem on Windows
(see https://commitfest.postgresql.org/26/2351/).
