Re: Windows default locale vs initdb

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Windows default locale vs initdb
Date: 2021-04-19 08:52:18
Message-ID: CAFj8pRD0NBwx25LSvDyEpQkO6vts6dW0okhDxUHUXEBxb1rCyg@mail.gmail.com
Lists: pgsql-hackers

On Mon, 19 Apr 2021 at 7:43, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
wrote:

> Hi,
>
> Moving this topic into its own thread from the one about collation
> versions, because it concerns pre-existing problems, and that thread
> is long.
>
> Currently initdb sets up template databases with old-style Windows
> locale names reported by the OS, and they seem to have caused us quite
> a few problems over the years:
>
> db29620d "Work around Windows locale name with non-ASCII character."
> aa1d2fc5 "Another attempt at fixing Windows Norwegian locale."
> db477b69 "Deal with yet another issue related to "Norwegian (Bokmål)"..."
> 9f12a3b9 "Tolerate version lookup failure for old style Windows locale..."
>
> ... and probably more, and also various threads about, for example,
> "German_German.1252" vs "German_Switzerland.1252" which seem to get
> confused or badly canonicalised or rejected somewhere in the mix.
>
> I hadn't focused on any of that before, being a non-Windows-user, but
> the entire contents of win32setlocale.c supports the theory that
> Windows' manual meant what it said when it said[1]:
>
> "We do not recommend this form for locale strings embedded in
> code or serialized to storage, because these strings are more likely
> to be changed by an operating system update than the locale name
> form."
>
> I suppose that was the only form available at the time the code was
> written, so there was no choice. The question we asked ourselves
> multiple times in the other thread was how we're supposed to get to
> the modern BCP 47 form when creating the template databases. It looks
> like one possibility, since Vista, is to call
> GetUserDefaultLocaleName()[2], which doesn't appear to have been
> discussed before on this list. That doesn't allow you to ask for the
> default for each individual category, but I don't know if that is even
> a concept for Windows user settings. It may be that some of the other
> nearby functions give a better answer for some reason. But one thing
> is clear from a test that someone kindly ran for me: it reports
> standardised strings like "en-NZ", not strings like "English_New
> Zealand.1252".
>
> No patch, but I wondered if any Windows hackers have any feedback on
> relative sanity of trying to fix all these problems this way.
>
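
For anyone who wants to try the call mentioned above, here is a minimal
sketch in C. It is an illustration only, not a proposed patch and not
PostgreSQL code: get_default_locale_name() and looks_like_bcp47() are
hypothetical helper names, and the shape check is a rough approximation
of BCP 47 (plain subtags only), assumed here just to tell "en-NZ" apart
from "English_New Zealand.1252". Outside Windows the lookup simply
reports failure.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#endif

/*
 * Hypothetical helper: fetch the user's default locale name in the
 * modern form (e.g. "en-NZ") as UTF-8.  Returns false on failure, and
 * always returns false on non-Windows builds.
 */
bool
get_default_locale_name(char *buf, size_t buflen)
{
#ifdef _WIN32
    wchar_t wname[LOCALE_NAME_MAX_LENGTH];

    /* Available since Vista; returns 0 on failure. */
    if (GetUserDefaultLocaleName(wname, LOCALE_NAME_MAX_LENGTH) == 0)
        return false;
    /* Convert the NUL-terminated UTF-16 name to UTF-8. */
    if (WideCharToMultiByte(CP_UTF8, 0, wname, -1,
                            buf, (int) buflen, NULL, NULL) == 0)
        return false;
    return true;
#else
    (void) buf;
    (void) buflen;
    return false;
#endif
}

/*
 * Hypothetical helper: rough shape check for a BCP 47 style tag.
 * Accepts a 2-3 letter language subtag followed by zero or more
 * "-subtag" parts of 1-8 alphanumerics.  Old-style names containing
 * '_', '.' or spaces are rejected.  Assumes the "C" ctype locale.
 */
bool
looks_like_bcp47(const char *name)
{
    size_t len = 0;

    assert(name != NULL);

    /* Primary language subtag: 2 or 3 letters. */
    while (isalpha((unsigned char) name[len]))
        len++;
    if (len < 2 || len > 3)
        return false;
    name += len;

    /* Zero or more "-subtag" parts. */
    while (*name == '-')
    {
        name++;
        len = 0;
        while (isalnum((unsigned char) name[len]))
            len++;
        if (len < 1 || len > 8)
            return false;
        name += len;
    }

    /* Any leftover character means this is not a plain tag. */
    return *name == '\0';
}
```

A caller could pass the buffer filled by get_default_locale_name() to
looks_like_bcp47() as a sanity check before storing the name anywhere.
Whether initdb should do anything like this is exactly the open question
in the quoted mail.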

Last weekend I talked with a user about an interesting (and messy)
issue. They needed to create a new database with a Czech collation on Azure
SAS, but pg_collation had no entry for the Czech language. Microsoft
support advised them to use CREATE DATABASE xxx TEMPLATE 'template0'
ENCODING 'utf8' LOCALE 'cs_CZ.UTF8', and that worked.

Regards

Pavel

> [1]
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-160
> [2]
> https://docs.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-getuserdefaultlocalename
>
>
>
