Re: Collation version tracking for macOS

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collation version tracking for macOS
Date: 2022-06-03 19:23:05
Message-ID: CA+hUKGLAAd+ftrUxJUV=Grgmf7gk0oVd_msG3oty2ufv-HVx+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Jun 4, 2022 at 12:17 AM Peter Eisentraut
<peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
> On 07.05.22 02:31, Thomas Munro wrote:
> > Last time I looked into this it seemed like macOS's strcoll() gave
> > sensible answers in the traditional single-byte encodings, but didn't
> > understand UTF-8 at all so you get C/strcmp() order. In other words
> > there was effectively nothing to version.
>
> Someone recently told me that collations in macOS have actually changed
> recently and that this is a live problem. See explanation here:
>
> https://github.com/PostgresApp/PostgresApp/blob/master/docs/documentation/reindex-warning.md?plain=1#L66

How can I see evidence of this? I'm comparing Debian, FreeBSD and
macOS 12.4 and when I run "LC_COLLATE=en_US.UTF-8 sort
/usr/share/dict/words" I get upper and lower case mixed together on
the other OSes, but on the Mac the upper case comes first, which is my
usual smoke test for "am I looking at binary sort order?"

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-06-03 19:53:18 Re: Rewriting the test of pg_upgrade as a TAP test - take three - remastered set
Previous Message Jeremy Schneider 2022-06-03 19:13:33 Re: Collation version tracking for macOS