Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.

From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Hiroshi Saito <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
Subject: Re: Re: [COMMITTERS] pgsql: Explicitly bind gettext() to the UTF8 locale when in use.
Date: 2008-12-04 01:35:42
Message-ID: 493733EE.7000503@tpf.co.jp
Lists: pgsql-committers pgsql-hackers

Magnus Hagander wrote:
> Hiroshi Inoue wrote:
>>> I think the thing is that as long as the encodings are compatible
>>> (latin1 with different names for example) it worked fine.
>>>
>>>> In any case I think the problem is that gettext is
>>>> looking at a setting that is not what we are looking at. Particularly
>>>> with the 8.4 changes to allow per-database locale settings, this has
>>>> got to be fixed in a bulletproof way.
>> Attached is a new patch to apply bind_textdomain_codeset() to most
>> server encodings. Exceptions are PG_SQL_ASCII, PG_MULE_INTERNAL
>> and PG_EUC_JIS_2004. "EUC-JP" may be OK for EUC_JIS_2004.
>>
>> Unfortunately it's hard for Saito-san and me to check encodings
>> other than EUC-JP.
>
> In principle this looks good, I think, but I'm a bit worried around the
> lack of testing.

Thanks and I agree with you.

> I can do some testing under LATIN1 which is what we use
> in Sweden (just need to get gettext working *at all* in my dev
> environment again - I've somehow managed to break it), and perhaps we
> can find someone to do a test in an eastern-european locale to get some
> more datapoints?
>
> Can you outline the steps one needs to go through to show the problem,
> so we can confirm it's fixed in these locales?

Saito-san and I have been working on a related problem, changing the
LC_MESSAGES locale properly under Windows, and should be able to
provide a patch in a few days. It would be preferable to apply that
patch as well, so that the message catalog can be switched easily.

Attached is an example in which LC_MESSAGES is cht_twn (zh_TW) and
the server encoding is EUC-TW. You can read it as UTF-8 text
because client_encoding is set to UTF-8 first.
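
For readers following the thread, the idea of binding gettext's output
codeset to the server encoding can be sketched roughly as below. This is
a minimal sketch, not the attached patch: the table, the helper names
codeset_for_encoding() and bind_codeset_for_encoding(), and the codeset
spellings are assumptions made for illustration, and the names accepted
by a given iconv implementation may differ. Only bind_textdomain_codeset()
is the real libintl call.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical table: server encoding name -> codeset name for gettext.
 * A NULL codeset means "do not bind", mirroring the exceptions named in
 * the thread (SQL_ASCII, MULE_INTERNAL, EUC_JIS_2004).
 */
struct encoding_codeset
{
    const char *pg_encoding;
    const char *codeset;
};

static const struct encoding_codeset encoding_codesets[] = {
    {"UTF8", "UTF-8"},
    {"LATIN1", "ISO-8859-1"},
    {"EUC_JP", "EUC-JP"},
    {"EUC_TW", "EUC-TW"},
    {"SQL_ASCII", NULL},
    {"MULE_INTERNAL", NULL},
    {"EUC_JIS_2004", NULL},
};

/* Return the codeset to bind for an encoding, or NULL if none applies. */
const char *
codeset_for_encoding(const char *pg_encoding)
{
    size_t      i;
    size_t      n = sizeof(encoding_codesets) / sizeof(encoding_codesets[0]);

    assert(pg_encoding != NULL);
    for (i = 0; i < n; i++)
    {
        if (strcmp(encoding_codesets[i].pg_encoding, pg_encoding) == 0)
            return encoding_codesets[i].codeset;
    }
    return NULL;                /* unknown encoding: leave gettext alone */
}

/*
 * Bind the message domain to the codeset matching the server encoding.
 * Returns 1 if a binding was made, 0 if the encoding is skipped or the
 * libintl call failed.
 */
int
bind_codeset_for_encoding(const char *domain, const char *pg_encoding)
{
    const char *codeset = codeset_for_encoding(pg_encoding);

    if (codeset == NULL)
        return 0;
    return bind_textdomain_codeset(domain, codeset) != NULL;
}
```

In the attached example the server encoding is EUC-TW, so under these
assumptions the call would be bind_codeset_for_encoding("postgres",
"EUC_TW"), where the domain name "postgres" is also only illustrative.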

BTW you can see another problem at line 4 in the text.
At that point LC_MESSAGES is still Japanese, and postgres fails
to convert a Japanese error message to the EUC_TW encoding. That is
no surprise, but it is not desirable.

regards,
Hiroshi Inoue

Attachment Content-Type Size
euctw.txt text/plain 704 bytes
