Re: More message encoding woes

From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More message encoding woes
Date: 2009-04-02 12:03:21
Message-ID: 49D4A989.8020907@tpf.co.jp
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>>> Tom Lane wrote:
>>>> Maybe use a special string "Translate Me First" that
>>>> doesn't actually need to be end-user-visible, just so no one sweats
>>>> over getting it right in context.
>>
>>> Yep, something like that. There seems to be a magic empty string
>>> translation at the beginning of every po file that returns the
>>> meta-information about the translation, like translation author and
>>> date. Assuming that works reliably, I'll use that.
>>
>> At first that sounded like an ideal answer, but I can see a gotcha:
>> suppose the translation's author's name contains some characters that
>> don't convert to the database encoding. I suppose that would result in
>> failure, when we'd prefer it not to. A single-purpose string could be
>> documented as "whatever you translate this to should be pure ASCII,
>> never mind if it's sensible".
>
> I just tried that, and it seems that gettext() does transliteration, so
> any characters that have no counterpart in the database encoding will be
> replaced with something similar, or question marks. Assuming that's
> universal across platforms, and I think it is, using the empty string
> should work.
>
> It also means that you can use lc_messages='ja' with
> server_encoding='latin1', but it will be unreadable because all the
> non-ascii characters are replaced with question marks. For something
> like lc_messages='es_ES' and server_encoding='koi8-r', it will still
> look quite nice.
>
> Attached is a patch I've been testing. Seems to work quite well. It
> would be nice if someone could test it on Windows, which seems to be a
> bit special in this regard.

Unfortunately it doesn't seem to work on Windows.

First, any combination of a valid lc_messages and a non-existent encoding
passes the test strcmp(gettext(""), "") != 0.
Second, for example, the combination of ja (lc_messages) and ISO-8859-1
passes the test, but the test fails after I changed the last_translator
part of the ja message catalog to contain Japanese kanji characters.

regards,
Hiroshi Inoue
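
For readers following the thread, the check under discussion can be sketched roughly as below. This is only an illustration of the strcmp(gettext(""), "") != 0 idea, not the code from the attached patch. The helper name catalog_has_header and its domain parameter are hypothetical, and the sketch uses dgettext() so that it does not depend on the current default text domain. It relies on the documented behaviour that gettext functions return the msgid unchanged when no translation is found.

```c
#include <libintl.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the patch): returns 1 if looking up
 * the empty msgid in `domain` yields a non-empty string, i.e. the catalog's
 * header entry was found, and 0 otherwise.
 */
static int
catalog_has_header(const char *domain)
{
    /* dgettext() hands back the msgid itself when nothing is found. */
    const char *header = dgettext(domain, "");

    return strcmp(header, "") != 0;
}
```

As reported above, on Windows this kind of check passed even with a non-existent encoding, and failed once the header contained kanji that ISO-8859-1 cannot represent, so a non-empty result alone may not show that the catalog converts cleanly to the server encoding.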
