Re: Windows default locale vs initdb

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows default locale vs initdb
Date: 2021-04-19 10:52:27
Message-ID: CAD5tBcJ8JapkFFxxVTsmyUtkFHJ=QkCG5PTOd47y=jy9Nz0_=w@mail.gmail.com
Lists: pgsql-hackers

On Mon, Apr 19, 2021 at 4:53 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

>
>
> On Mon, 19 Apr 2021 at 7:43, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
> wrote:
>
>> Hi,
>>
>> Moving this topic into its own thread from the one about collation
>> versions, because it concerns pre-existing problems, and that thread
>> is long.
>>
>> Currently initdb sets up template databases with old-style Windows
>> locale names reported by the OS, and they seem to have caused us quite
>> a few problems over the years:
>>
>> db29620d "Work around Windows locale name with non-ASCII character."
>> aa1d2fc5 "Another attempt at fixing Windows Norwegian locale."
>> db477b69 "Deal with yet another issue related to "Norwegian (Bokmål)"..."
>> 9f12a3b9 "Tolerate version lookup failure for old style Windows locale..."
>>
>> ... and probably more, and also various threads about, for example,
>> "German_German.1252" vs "German_Switzerland.1252" which seem to get
>> confused or badly canonicalised or rejected somewhere in the mix.
>>
>> I hadn't focused on any of that before, being a non-Windows-user, but
>> the entire contents of win32setlocale.c supports the theory that
>> Windows' manual meant what it said when it said[1]:
>>
>> "We do not recommend this form for locale strings embedded in
>> code or serialized to storage, because these strings are more likely
>> to be changed by an operating system update than the locale name
>> form."
>>
>> I suppose that was the only form available at the time the code was
>> written, so there was no choice. The question we asked ourselves
>> multiple times in the other thread was how we're supposed to get to
>> the modern BCP 47 form when creating the template databases. It looks
>> like one possibility, since Vista, is to call
>> GetUserDefaultLocaleName()[2], which doesn't appear to have been
>> discussed before on this list. That doesn't allow you to ask for the
>> default for each individual category, but I don't know if that is even
>> a concept for Windows user settings. It may be that some of the other
>> nearby functions give a better answer for some reason. But one thing
>> is clear from a test that someone kindly ran for me: it reports
>> standardised strings like "en-NZ", not strings like "English_New
>> Zealand.1252".
>>
>> No patch, but I wondered if any Windows hackers have any feedback on
>> the relative sanity of trying to fix all these problems this way.
>>
>
> Last weekend I talked with a user about an interesting (and messy)
> issue. They needed to create a new database with Czech collation on Azure
> SAS. There was no entry in pg_collation for the Czech language. The reply
> from Microsoft support was to use CREATE DATABASE xxx TEMPLATE 'template0'
> ENCODING 'utf8' LOCALE 'cs_CZ.UTF8', and that worked.
>
>
>
My understanding from Microsoft staff at conferences is that Azure's
PostgreSQL SAS runs on Linux, not Windows.

cheers

andrew
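
For readers comparing the locale-name forms that appear in this thread, here
is a minimal Python sketch that sorts a name into one of three rough shapes.
The function name classify_locale_name and the regular expressions are
hypothetical; they are derived only from the examples quoted above ("en-NZ",
"English_New Zealand.1252", "cs_CZ.UTF8") and are a simplification, not how
PostgreSQL or Windows actually parse locale names.

```python
import re

# Assumed shapes, based only on the examples quoted in this thread.
# BCP 47 style, e.g. "en-NZ": letters, then optional hyphen-separated subtags.
BCP47_LIKE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*")
# POSIX style, e.g. "cs_CZ.UTF8": ll_CC with an optional ".codeset" suffix.
POSIX_LIKE = re.compile(r"[a-z]{2,3}_[A-Z]{2}(\.[A-Za-z0-9-]+)?")
# Old-style Windows, e.g. "English_New Zealand.1252":
# Language_Country.codepage, where the code page is numeric.
WINDOWS_OLD_STYLE = re.compile(r"[^_.]+_[^_.]+\.\d+")


def classify_locale_name(name: str) -> str:
    """Return a rough label for the shape of a locale name (heuristic only)."""
    if BCP47_LIKE.fullmatch(name):
        return "bcp47"
    if POSIX_LIKE.fullmatch(name):
        return "posix"
    if WINDOWS_OLD_STYLE.fullmatch(name):
        return "windows-old-style"
    return "unknown"


if __name__ == "__main__":
    for candidate in ("en-NZ", "English_New Zealand.1252", "cs_CZ.UTF8"):
        print(candidate, "->", classify_locale_name(candidate))
```

A name such as 'cs_CZ.UTF8' falls into the POSIX shape under this heuristic,
which is consistent with the service in question running on Linux.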
