locales and encodings on Windows

From: Aleksander Kmetec <aleksander(dot)kmetec(at)intera(dot)si>
To: pgsql-hackers-win32(at)postgresql(dot)org
Subject: locales and encodings on Windows
Date: 2004-11-06 01:26:19
Message-ID: 418C283B.1060905@intera.si
Lists: pgsql-hackers-win32
I would like to bring to your attention a problem regarding locale 
support on Windows. The description below uses UNICODE/UTF8, but the 
issue isn't limited to just this encoding.

Because Postgres relies on the operating system for some string-related 
functions, the OS needs to support the same encoding as the one used as 
the database encoding. Unfortunately, Windows does not support some of 
the encodings that are available as server-side encodings for PG.
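
To illustrate why the encodings have to match, here is a minimal Python 
sketch. It is not Postgres code: Python codecs stand in for the OS 
locale's codepage, upper_via_codepage is a hypothetical helper, and 
cp1250 is only an assumed example of a single-byte codepage.

```python
# Sketch only: Python codecs stand in for the codepage of the OS locale.

def upper_via_codepage(data: bytes, codepage: str) -> bytes:
    """Uppercase `data` as if its bytes were text encoded in `codepage`."""
    text = data.decode(codepage, errors="replace")
    return text.upper().encode(codepage, errors="replace")


# Bytes as they would be stored in a UTF8 database.
stored = "čaša".encode("utf-8")

# Case mapping with the matching encoding.
right = upper_via_codepage(stored, "utf-8")

# Case mapping when the locale assumes a single-byte codepage instead
# (cp1250 is an assumption chosen for this example).
wrong = upper_via_codepage(stored, "cp1250")

print(right.decode("utf-8"))
print(wrong.decode("utf-8", errors="replace"))
```

With the matching encoding every letter is uppercased. With the 
single-byte codepage each UTF-8 byte is treated as a separate character, 
so the non-ASCII letters are not mapped correctly.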

Here is a short example in case the previous paragraph doesn't make much 
sense: with a UNICODE database (actually UTF8) you need to use a 
compatible locale when running initdb; in my case that's "sl_SI.utf8" 
(on Linux) or "Slovenian_Slovenia.65001" (on Windows).

65001 is the Windows codepage number for UTF-8, except that it is not 
really a valid codepage. The document at 
http://www.sharmahd.com/tm/codepages.html states: "65000 (UTF-7) and 
65001 (UTF-8) are pseudo codepages. There are no corresponding NLS 
files. The code page IDs can only be used with WideCharToMultiByte( ) 
and MultiByteToWideChar( ) API calls."
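
As a rough illustration, a check for such locale names could look like 
the sketch below. The helper names (parse_windows_locale, 
is_pseudo_codepage) are hypothetical and not part of initdb or any 
Windows API; the only codepage numbers treated as pseudo codepages are 
the two named in the quote. The 1250 in the usage lines is just an 
assumed example of an ordinary codepage.

```python
from typing import NamedTuple, Optional

# The two pseudo codepages named in the quoted document.
PSEUDO_CODEPAGES = {65000: "UTF-7", 65001: "UTF-8"}


class WinLocale(NamedTuple):
    language: str
    country: Optional[str]
    codepage: Optional[int]


def parse_windows_locale(name: str) -> WinLocale:
    """Split a 'Language_Country.codepage' locale name into its parts."""
    base, _, cp = name.partition(".")
    language, _, country = base.partition("_")
    codepage = int(cp) if cp.isdigit() else None
    return WinLocale(language, country or None, codepage)


def is_pseudo_codepage(name: str) -> bool:
    """True if the locale name ends in a codepage that has no NLS file."""
    return parse_windows_locale(name).codepage in PSEUDO_CODEPAGES


print(is_pseudo_codepage("Slovenian_Slovenia.65001"))
print(is_pseudo_codepage("Slovenian_Slovenia.1250"))
```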

This means that UPPER(), LOWER() and ORDER BY do not work correctly for 
Unicode databases. Currently it's not even possible to run initdb with a 
locale that uses the 65001 codepage. A small change to initdb enabled me 
to set LC_COLLATE to Slovenian_Slovenia.65001, but the sort order was 
still badly messed up, which makes sense considering the above quote.
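
The kind of wrong sort order I mean can be reproduced with a small 
Python sketch. This is only an approximation: comparing raw UTF-8 bytes 
stands in for a collation that does not understand the encoding, and 
slovenian_key is a hypothetical, simplified key based on the Slovenian 
alphabet, not a real locale implementation.

```python
# Slovenian alphabet; letters not listed sort after all of these.
SLOVENIAN_ALPHABET = "abcčdefghijklmnoprsštuvzž"
RANK = {ch: i for i, ch in enumerate(SLOVENIAN_ALPHABET)}


def slovenian_key(word: str) -> list:
    """Simplified sort key: rank each letter by its alphabet position."""
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]


words = ["dan", "čaj", "cesta", "žaba", "zob"]

# What you get when strings are compared byte by byte.
bytewise = sorted(words, key=lambda w: w.encode("utf-8"))

# What a Slovenian reader expects.
linguistic = sorted(words, key=slovenian_key)

print(bytewise)    # ['cesta', 'dan', 'zob', 'čaj', 'žaba']
print(linguistic)  # ['cesta', 'čaj', 'dan', 'zob', 'žaba']
```

In the byte-wise result every word starting with a non-ASCII letter 
ends up after "z", because its first UTF-8 byte is larger than any 
ASCII byte.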

After some checking I came up with this list of encodings which are 
supported by PG, but not mentioned anywhere as supported by Windows:
UTF8
EUC_CN
EUC_TW
LATIN6 (ISO 8859-10/ECMA 144)
LATIN7 (ISO 8859-13)
LATIN8 (ISO 8859-14)
LATIN10 (ISO 8859-16/ASRO SR 14111)

Is there a solution for this, other than marking these encodings as not 
available on Windows?

Regards,
Aleksander