Re: locales and encodings on Windows

From: Aleksander Kmetec <aleksander(dot)kmetec(at)intera(dot)si>
To: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: locales and encodings on Windows
Date: 2004-11-11 06:37:03
Lists: pgsql-hackers-win32
Come on, people. This was the second time I reported this bug and also 
the second time nobody responded to my report. :-(

If it is indeed not possible to initdb with a utf8 (65001) locale, then 
this will cause a flood of bug reports once a large number of people 
start using PG on Windows. Can somebody try and confirm this problem? 
Simply try running initdb with a --locale value of german_germany.65001, 
spanish_spain.65001, french_france.65001 or any other locale you think 
should be supported by your system. You will need to do this from the 
command line, not from the installer. Does initdb accept this value or 
does it replace it with your current system locale?
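A minimal sketch of the "accept or reject" check at the C library level, written in Python for brevity. It assumes that the locale name is ultimately validated by the C runtime's setlocale() call; the helper name locale_is_accepted is hypothetical, and this is not the code initdb actually runs. The outcome depends on the operating system and C runtime version.

```python
import locale


def locale_is_accepted(name, category=locale.LC_COLLATE):
    """Return True if the C library accepts `name` for `category`.

    Hypothetical helper for illustration only; it mirrors the kind of
    setlocale() check a program can make, not initdb's own logic.
    """
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        # The C runtime refused the locale name.
        return False
    finally:
        # Always restore whatever was active before the probe.
        locale.setlocale(category, saved)


if __name__ == "__main__":
    # Locale names taken from the message above; results vary by system.
    for candidate in ("german_germany.65001",
                      "spanish_spain.65001",
                      "french_france.65001"):
        print(candidate, locale_is_accepted(candidate))
```

Running the script on the affected machine shows directly whether each name is rejected before initdb is involved at all.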

Unless somebody can come up with a solution, my suggestion for a 
work-around would be to remove unsupported encodings from the installer 
or at least warn users that their database will not be fully functional 
if they happen to choose one of the unsupported encodings.

Any comments?

Last October there was a discussion on pgsql-hackers about writing 
locale support for PG, so it wouldn't depend on the system for locale 
functionality any more. Is anyone still working on that?


Aleksander Kmetec wrote:
> I would like to bring to your attention a problem regarding locale 
> support on Windows. The description below uses UNICODE/UTF8, but the 
> issue isn't limited to just this encoding.
> Because Postgres relies on the operating system for some string related 
> functions, the OS needs to support the same encoding as the one that is 
> used as the database encoding. Unfortunately, Windows does not support 
> some encodings that are available as server-side encodings for PG.
> Here is a short example in case the previous paragraph doesn't make much 
> sense: with a UNICODE database (actually UTF8) you need to use a 
> compatible locale when running initdb; in my case that's "sl_SI.utf8" 
> (on Linux) or "Slovenian_Slovenia.65001" (on Windows).
> 65001 is the Windows codepage number for utf8; except it's not really a 
> valid codepage. The document at 
> states that: "65000 (UTF-7) 
> and 65001 (UTF-8) are pseudo codepages. There are no corresponding NLS 
> files. The code page IDs can only be used with WideCharToMultiByte( ) 
> and MultiByteToWideChar( ) API calls."
> This means that UPPER(), LOWER() and ORDER BY do not work correctly for 
> unicode databases. Currently it's not even possible to run initdb with 
> a locale which uses 65001 encoding. A small change to initdb enabled me 
> to set LC_COLLATE to Slovenian_Slovenia.65001, but the sort order was 
> still badly messed up, which makes sense considering the above quote.
> After some checking I came up with this list of encodings which are 
> supported by PG, but not mentioned anywhere as supported by Windows:
> UTF8
> LATIN6 (ISO 8859-10/ECMA 144)
> LATIN7 (ISO 8859-13)
> LATIN8 (ISO 8859-14)
> LATIN10 (ISO 8859-16/ASRO SR 14111)
> Is there a solution for this, other than marking these encodings as not 
> available on Windows?
> Regards,
> Aleksander
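
The symptoms described in the quoted message (wrong sort order, UPPER() and LOWER() not working) can be imitated without any database. The sketch below is an illustration under stated assumptions, not PostgreSQL's code path: it compares plain UTF-8 byte order with a hand-written Slovenian alphabet order, and an ASCII-only case conversion with a Unicode-aware one. The names SLOVENIAN_ALPHABET, slovenian_key and ascii_only_upper are hypothetical, and the key function ignores digits, punctuation and foreign letters.

```python
# The 25 letters of the Slovenian alphabet, in dictionary order.
SLOVENIAN_ALPHABET = "abcčdefghijklmnoprsštuvzž"
RANK = {letter: index for index, letter in enumerate(SLOVENIAN_ALPHABET)}


def slovenian_key(word):
    """Sort key following the alphabet above; unknown characters sort last."""
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]


def ascii_only_upper(text):
    """Upper-case only ASCII letters, the way a locale that does not
    understand the encoding would treat UTF-8 bytes."""
    return text.encode("utf-8").upper().decode("utf-8")


words = ["zima", "čas", "cesta", "dan"]

# Comparing raw UTF-8 bytes puts every accented letter after "z".
byte_order = sorted(words, key=lambda w: w.encode("utf-8"))

# The expected order places "č" between "c" and "d".
alphabet_order = sorted(words, key=slovenian_key)

print(byte_order)               # ['cesta', 'dan', 'zima', 'čas']
print(alphabet_order)           # ['cesta', 'čas', 'dan', 'zima']
print(ascii_only_upper("čas"))  # čAS
print("čas".upper())            # ČAS
```

The multi-byte letter is left untouched by the ASCII-only conversion and sorts after "z" in byte order, which matches the kind of breakage reported above.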
