Re: main log encoding problem

From: Alexander Law <exclusion(at)gmail(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: pgsql-general(at)postgresql(dot)org, yi(dot)codeplayer(at)gmail(dot)com, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: main log encoding problem
Date: 2012-07-19 07:23:39
Message-ID: 5007B5FB.4020208@gmail.com
Lists: pgsql-bugs pgsql-general pgsql-hackers

Hello,
>
> Implementing any of these isn't trivial - especially making sure
> messages emitted to stderr from things like segfaults and dynamic
> linker messages are always correct. Ensuring that the logging
> collector knows when setlocale() has been called to change the
> encoding and translation of system messages, handling the different
> logging output methods, etc - it's going to be fiddly.
>
> I have some performance concerns about the transcoding required for
> (b) or (c), but realistically it's already the norm to convert all the
> data sent to and from clients. Conversion for logging should not be a
> significant additional burden. Conversion can be short-circuited out
> when source and destination encodings are the same for the common case
> of logging in utf-8 or to a dedicated file.
>
The initial issue was that the log file contains messages in different
encodings. So transcoding is already performed, but not consistently,
and in my opinion this is the main problem.
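
To illustrate the consistency point, here is a minimal Python sketch (not
PostgreSQL code) of a logging path that always converts messages to one
target encoding and skips the conversion when source and target already
match. The names LOG_ENCODING, same_encoding and to_log_encoding are
hypothetical and only serve the example.

```python
import codecs

LOG_ENCODING = "utf-8"  # assumed single target encoding for the log file


def same_encoding(a: str, b: str) -> bool:
    """Compare encoding names after normalising aliases (e.g. 'UTF8' vs 'utf-8')."""
    return codecs.lookup(a).name == codecs.lookup(b).name


def to_log_encoding(raw: bytes, source_encoding: str) -> bytes:
    """Return the message bytes in LOG_ENCODING."""
    if same_encoding(source_encoding, LOG_ENCODING):
        return raw  # short-circuit: nothing to convert
    # Undecodable bytes are replaced rather than raising, so logging never fails.
    text = raw.decode(source_encoding, errors="replace")
    return text.encode(LOG_ENCODING, errors="replace")
```

With this shape, every line that reaches the file is in the same encoding,
whatever encoding the message originally had.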

> I suspect the eventual choice will be "all of the above":
>
> - Default to (b) or (c), both have pros and cons. I favour (c) with a
> UTF-8 BOM to warn editors, but (b) is nice for people whose DBs are
> all in the system locale.
As I understand it, UTF-8 is the default encoding for databases. And even
when a database is in the system encoding, translated postgres messages
still come in UTF-8 and go through a UTF-8 -> system locale conversion
within gettext.
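
As a rough sketch of what option (c) with a BOM could look like, the
following Python fragment appends UTF-8 lines to a log file and writes the
UTF-8 byte order mark only once, when the file is new or empty. This is an
illustration only, not how the PostgreSQL logging collector works; the
function name append_log_line is hypothetical.

```python
import codecs
import os


def append_log_line(path: str, message: str) -> None:
    """Append one UTF-8 line, writing a BOM only when the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "ab") as f:
        if is_new:
            f.write(codecs.BOM_UTF8)  # the three bytes EF BB BF
        f.write(message.encode("utf-8") + b"\n")
```
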
>
> - Allow (a) for people who have many different DBs in many different
> encodings, do high volume logging, and want to avoid conversion
> overhead. Let them deal with the mess, just provide an additional %
> code for the encoding so they can name their per-DB log files to
> indicate the encoding.
>
I think solution (a) can be a later evolution of the logging mechanism,
if a need for it arises.
> The main issue is just that code needs to be prototyped, cleaned up,
> and submitted. So far nobody's cared enough to design it, build it,
> and get it through patch review. I've just foolishly volunteered
> myself to work on an automated crash-test system for virtual plug-pull
> testing, so I'm not stepping up.
>
I see your point, and I can prepare a prototype if the proposed solution
(c) seems reasonable enough and can be accepted.

Best regards,
Alexander
