Re: main log encoding problem

From: Tatsuo Ishii <ishii(at)postgresql(dot)org>
To: exclusion(at)gmail(dot)com
Cc: ringerc(at)ringerc(dot)id(dot)au, pgsql-general(at)postgresql(dot)org, yi(dot)codeplayer(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: main log encoding problem
Date: 2012-07-19 07:49:51
Message-ID: 20120719.164951.1616935804595452729.t-ishii@sraoss.co.jp
Lists: pgsql-bugs pgsql-general pgsql-hackers

> Hello,
>>
>> Implementing any of these isn't trivial - especially making sure
>> messages emitted to stderr from things like segfaults and dynamic
>> linker messages are always correct. Ensuring that the logging
>> collector knows when setlocale() has been called to change the
>> encoding and translation of system messages, handling the different
>> logging output methods, etc - it's going to be fiddly.
>>
>> I have some performance concerns about the transcoding required for
>> (b) or (c), but realistically it's already the norm to convert all the
>> data sent to and from clients. Conversion for logging should not be a
>> significant additional burden. Conversion can be short-circuited out
>> when source and destination encodings are the same for the common case
>> of logging in utf-8 or to a dedicated file.
>>
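
For illustration only, the short-circuit described in the quoted paragraph above might look like the following Python sketch. The function name to_log_encoding and the use of Python are assumptions made for this example. PostgreSQL's own conversion routines are written in C and are not shown here.

```python
import codecs


def to_log_encoding(raw: bytes, source_encoding: str, log_encoding: str) -> bytes:
    """Return raw re-encoded for the log (hypothetical helper, not PostgreSQL code)."""
    # Compare canonical codec names so that "UTF8" and "utf-8" count as equal.
    if codecs.lookup(source_encoding).name == codecs.lookup(log_encoding).name:
        # Common case: nothing to convert, hand back the same bytes untouched.
        return raw
    text = raw.decode(source_encoding)
    # Characters the log encoding cannot represent are written as escapes
    # instead of raising, so a log line is never lost.
    return text.encode(log_encoding, errors="backslashreplace")


same = b"plain ascii message"
converted = to_log_encoding("\u043e".encode("cp1251"), "cp1251", "utf-8")
```

When both names resolve to the same codec the input bytes are returned as they are, so the common case costs one comparison and no copy.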
> The initial issue was that the log file contains messages in different
> encodings. So transcoding is already performed, but it's not

This is not true. Transcoding happens only when PostgreSQL is built
with the --enable-nls option (the default is no NLS).
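
Whatever the cause, the symptom reported in this thread (one log file holding messages in different encodings) is easy to reproduce outside PostgreSQL. The sketch below is a hypothetical illustration: the sample word and the choice of cp1251 as the second encoding are assumptions, not taken from the report.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


message = "\u043e\u0448\u0438\u0431\u043a\u0430"  # Russian for "error"

# The same message written by two sources that disagree on the encoding.
line_utf8 = message.encode("utf-8")
line_cp1251 = message.encode("cp1251")
mixed_log = line_utf8 + b"\n" + line_cp1251 + b"\n"

print(decodes_as(line_utf8, "utf-8"))     # True
print(decodes_as(line_cp1251, "cp1251"))  # True
print(decodes_as(mixed_log, "utf-8"))     # False
```

Each line is valid on its own, but the file as a whole no longer decodes as UTF-8, which is why an editor cannot display it consistently.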

> consistent and in my opinion this is the main problem.
>
>> I suspect the eventual choice will be "all of the above":
>>
>> - Default to (b) or (c), both have pros and cons. I favour (c) with a
>>   UTF-8 BOM to warn editors, but (b) is nice for people whose DBs are
>>   all in the system locale.
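
As a rough sketch of the BOM idea in the quoted item above, assuming that option (c) means writing the whole log in UTF-8: the BOM has to be written exactly once, at the start of a new or empty file, and never again on later appends. The helper name append_log_line, the file name, and the sample messages are hypothetical.

```python
import codecs
import os
import tempfile


def append_log_line(path: str, message: str) -> None:
    """Append one UTF-8 line, writing a BOM first if the file is new or empty."""
    needs_bom = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "ab") as log:
        if needs_bom:
            # EF BB BF at offset 0 tells editors the file is UTF-8.
            log.write(codecs.BOM_UTF8)
        log.write(message.encode("utf-8") + b"\n")


# Usage: two appends to a fresh file produce a single leading BOM.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "example.log")
    append_log_line(demo_path, "sample message one")
    append_log_line(demo_path, "sample message \u043e\u0448\u0438\u0431\u043a\u0430")
    with open(demo_path, "rb") as log:
        demo_bytes = log.read()
```

Checking the file size before each append keeps a BOM from landing in the middle of the file after a restart.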
> As I understand it, UTF-8 is the default encoding for databases. And
> even when a database is in the system encoding, translated postgres
> messages still come in UTF-8 and will go through a UTF-8 -> system
> locale conversion within gettext.

Again, this is not always true.

>> - Allow (a) for people who have many different DBs in many different
>>   encodings, do high volume logging, and want to avoid conversion
>>   overhead. Let them deal with the mess, just provide an additional %
>>   code for the encoding so they can name their per-DB log files to
>>   indicate the encoding.
>>
> I think solution (a) can be a later evolution of the logging
> mechanism, if a need for it arises.
>> The main issue is just that code needs to be prototyped, cleaned up,
>> and submitted. So far nobody's cared enough to design it, build it,
>> and get it through patch review. I've just foolishly volunteered
>> myself to work on an automated crash-test system for virtual plug-pull
>> testing, so I'm not stepping up.
>>
> I see your point, and I can prepare a prototype if the proposed
> solution (c) seems reasonable enough to be accepted.
>
> Best regards,
> Alexander
