Re: [GENERAL] trouble with to_char('L')

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] trouble with to_char('L')
Date: 2010-04-20 13:38:24
Message-ID: 201004201338.o3KDcOU00788@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Magnus Hagander wrote:
> > Another idea is to use GetLocaleInfoW() [1], that is win32 native locale
> > functions, instead of the libc one. It returns locale characters in wide
> > chars, so we can safely convert them as UTF16->UTF8->db. But it requires
> > an additional branch in our locale codes only for Windows.
>
> If we can go UTF16->db directly, it might be a good idea. If we're
> going via UTF8 anyway, I doubt it's going to be worth it.
>
> Let's work off what we have now to start with at least. Bruce, can you
> comment on that thing about the extra parameter? And UTF8?

I do like the idea of using UTF16 directly because that would eliminate
our need to even set LC_CTYPE for Win32 in this routine. That would
also eliminate any need to refer to the encoding for numeric/monetary,
so we could get rid of the odd case where their encoding is UTF8 but
their numeric/monetary locale settings have to use a non-UTF8 encoding.
For example, the original bug report has these locale settings:

http://archives.postgresql.org/pgsql-general/2009-04/msg00829.php

psql (PostgreSQL) 8.3.7

server_version 8.3.7
server_encoding UTF8
client_encoding win1252
lc_numeric Finnish, Finland
lc_monetary Finnish, Finland

but really needed to use "Finnish_Finland.1252":

http://archives.postgresql.org/pgsql-general/2009-04/msg00859.php

However, I noticed that both lc_collate and lc_ctype are set to
Finnish_Finland.1252 by the installer. Should I have just run initdb
with --locale fi_FI.UTF8 at the very start? The to_char('L') works
fine with a database with win1252 encoding.

Of course, that still does not work with our current CVS code if the
database encoding is UTF8, which is what we are trying to fix now.

I am not even sure how users set these things properly but I assume the
installer does all that magic. And, of course, if someone manually runs
initdb on Windows, they can easily set things wrong.

Magnus, if I remember correctly, all our non-UTF8 to UTF8 conversion
already has to pass through UTF16 as an intermediary case, so going to
UTF16 directly seems fine.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2010-04-20 13:50:20 Re: tar error, in pg_start_backup()
Previous Message Tom Lane 2010-04-20 13:36:22 Re: Int64GetDatum

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-04-20 13:40:05 Re: [HACKERS] Streaming replication document improvements
Previous Message Kevin Grittner 2010-04-20 13:35:06 Re: [DOCS] Streaming replication document improvements