Re: BUG #7493: Postmaster messages unreadable in a Windows console

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Law <exclusion(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: BUG #7493: Postmaster messages unreadable in a Windows console
Date: 2013-02-10 21:02:59
Message-ID: 20130210210259.GA7401@tornado.leadboat.com
Lists: pgsql-bugs pgsql-general pgsql-hackers

On Wed, Jan 30, 2013 at 10:00:01AM +0400, Alexander Law wrote:
> 30.01.2013 05:51, Noah Misch wrote:
>> On Tue, Jan 29, 2013 at 09:54:04AM -0500, Tom Lane wrote:
>>> Alexander Law <exclusion(at)gmail(dot)com> writes:
>>>> Please look at the following l10n bug:
>>>> http://www.postgresql.org/message-id/502A26F1.6010109@gmail.com
>>>> and the proposed patch.

>> Even then, I wouldn't be surprised to find problematic consequences beyond
>> error display. What if all the databases are EUC_JP, the platform encoding is
>> KOI8, and some postgresql.conf settings contain EUC_JP characters? Does the
>> postmaster not rely on its use of SQL_ASCII to allow those values?
>>
>> I would look at fixing this by making the error output machinery smarter in
>> this area before changing the postmaster's notion of server_encoding.

With your proposed change, the problem will resurface in an actual SQL_ASCII
database. At the problem's root is write_console()'s assumption that messages
are in the database encoding. pg_bind_textdomain_codeset() tries to make that
so, but it only works for encodings with a pg_enc2gettext_tbl entry. That
excludes SQL_ASCII, MULE_INTERNAL, and others. write_console() needs to
behave differently in such cases.
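
To make the failure mode concrete, here is a rough sketch in Python rather than the server's C. The table is an illustrative subset standing in for pg_enc2gettext_tbl, and message_encoding() is a hypothetical helper, not a function in the tree. It also assumes that, when no codeset is bound, gettext keeps producing messages in the platform's codeset.

```python
# Illustrative subset only; stands in for pg_enc2gettext_tbl.
# SQL_ASCII and MULE_INTERNAL are deliberately absent, as in the real table.
ENC_TO_GETTEXT_CODESET = {
    "UTF8": "UTF-8",
    "KOI8R": "KOI8-R",
    "EUC_JP": "EUC-JP",
}


def message_encoding(database_encoding, platform_encoding):
    """Hypothetical helper: which encoding translated messages end up in.

    If the database encoding has a gettext codeset, binding succeeds and
    messages arrive in the database encoding. Otherwise (assumption) gettext
    keeps using the platform codeset, so a console writer must not treat the
    message bytes as database-encoded text.
    """
    if database_encoding in ENC_TO_GETTEXT_CODESET:
        return database_encoding
    return platform_encoding


# A SQL_ASCII database on a KOI8R platform: messages are KOI8R, not SQL_ASCII.
print(message_encoding("SQL_ASCII", "KOI8R"))
```

The point of the sketch is only that the "messages are in the database encoding" assumption holds for the first branch and fails for the second.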

> Maybe I'm still missing something, but I thought that
> postinit.c/CheckMyDatabase will switch the encoding of messages to EUC_JP
> via pg_bind_textdomain_codeset, so there will be no issues with it.
> But until then KOI8 should be used.
> Regarding postgresql.conf, as it has no explicit encoding specification,
> it should be interpreted as having the platform encoding. So in your
> example it should contain KOI8, not EUC_JP characters.

Following some actual testing, I see that we treat postgresql.conf values as
byte sequences; any reinterpretation as encoded text happens later. Hence,
contrary to my earlier suspicion, your patch does not make that situation
worse. The present situation is bad; among other things, current_setting() is
a vector for injecting invalid text data. But unconditionally validating
postgresql.conf values in the platform encoding would not be an improvement.
Suppose you have a UTF-8 platform encoding and KOI8R databases. You may wish
to put KOI8R strings in a GUC, say search_path. That's possible today; if we
required that postgresql.conf conform to the platform encoding and no other,
it would become impossible. This area warrants improvement, but doing so will
entail careful design.
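
As a small illustration of the UTF-8 platform / KOI8R database case, again in Python and not server code: the schema name and the is_valid_in() helper below are made up for the example. A search_path value stored as KOI8R bytes is a perfectly good byte sequence, but it would be rejected by a check that insists on the platform encoding.

```python
def is_valid_in(raw, encoding):
    """Hypothetical check: do these bytes decode cleanly in `encoding`?"""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# A made-up Cyrillic schema name, written to the config file as KOI8R bytes.
search_path_value = "схема".encode("koi8_r")

# Valid for the KOI8R databases that will eventually interpret it ...
print(is_valid_in(search_path_value, "koi8_r"))
# ... but not valid as text in a UTF-8 platform encoding.
print(is_valid_in(search_path_value, "utf-8"))
```

Treating the value as bytes until a backend interprets it is what lets this configuration work today.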

Thanks,
nm
