Re: [GENERAL] main log encoding problem

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Alexander Law <exclusion(at)gmail(dot)com>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-general(at)postgresql(dot)org, ringerc(at)ringerc(dot)id(dot)au, yi(dot)codeplayer(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: [GENERAL] main log encoding problem
Date: 2012-07-19 08:58:42
Message-ID: CAF-3MvPWk_xo=sUFvJqYrGrnh4G-rps67rn_-SoEMvCvygymug@mail.gmail.com
Lists: pgsql-bugs pgsql-general pgsql-hackers

On 19 July 2012 10:40, Alexander Law <exclusion(at)gmail(dot)com> wrote:
>>> OK, maybe the time of a real universal encoding has not yet come. Then
>>> maybe we should just add a new parameter "log_encoding" (UTF-8 by
>>> default) to postgresql.conf, and use this encoding consistently
>>> within logging_collector.
>>> If this encoding is not available then fall back to 7-bit ASCII.
>>
>> What do you mean by "not available"?
>
> Sorry, that was inaccurately phrased. I mean "if the conversion to this
> encoding is not available". For example, when we have a database in EUC_JP
> and log_encoding set to Latin1. I think that we can even fall back to UTF-8,
> as we can convert all encodings to it (with some exceptions that you noticed).

I like Craig's idea of adding the client encoding to the log lines. A
possible problem with that (I'm not an encoding expert) is that such a
log line will contain the database server's meta-data (log time, client
encoding, etc.) in the database default encoding and the database data
(the logged query and user-supplied values) in the client encoding. One
option would be to use the client encoding for the entire log line, but
would that result in legible meta-data in every encoding?
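
To illustrate the concern, here is a minimal Python sketch. The log line
layout, the localized severity word and the sample statement are all made up
for the example, and Python codec names stand in for the PostgreSQL encoding
names. It builds a line whose meta-data is UTF-8 and whose statement is
EUC_JP, then tries the alternative of writing the meta-data in a client
encoding that cannot represent it.

```python
def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if every byte of data is valid in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical log line: localized meta-data in the server encoding (UTF-8),
# followed by a statement kept in the client encoding (EUC_JP).
severity = "ОШИБКА"
meta = f"2012-07-19 08:58:42 {severity} client_encoding=EUC_JP: ".encode("utf-8")
statement = "SELECT 'テスト'".encode("euc_jp")
mixed_line = meta + statement

# The combined line is valid in neither of the two encodings involved.
mixed_ok_as_utf8 = decodes_cleanly(mixed_line, "utf-8")
mixed_ok_as_eucjp = decodes_cleanly(mixed_line, "euc_jp")

# Writing the whole line in the client encoding instead: a Latin1 client
# cannot represent the Cyrillic severity word, so it degrades to "?".
meta_for_latin1_client = severity.encode("latin-1", errors="replace")

print(mixed_ok_as_utf8, mixed_ok_as_eucjp, meta_for_latin1_client)
```

In this sketch the mixed line fails to decode as either UTF-8 or EUC_JP, and
the meta-data written for a Latin1 client loses its non-ASCII characters,
which is the legibility problem raised above.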

It appears that the primary problem here is that SQL statements and
user-supplied data are being logged, while the log file is a text file
in a fixed encoding.
Perhaps another solution would be to add the ability to log certain
types of information (not the core database server log info, of
course!) to a database/table so that each record can be stored in its
own encoding?
That way the transcoding doesn't have to take place until someone is
reading the log, you'd know what to transcode the data to (namely the
client_encoding of the reading session) and there isn't any issue of
transcoding errors while logging statements.
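
A rough sketch of that idea, in Python with an in-memory SQLite table standing
in for a PostgreSQL log table. The table name, column names and helper
functions are invented for illustration and are not an existing PostgreSQL
feature; the encoding names are Python codec names. Each record keeps its raw
bytes plus the encoding they arrived in, and transcoding happens only when
someone reads the log.

```python
import sqlite3


def open_log() -> sqlite3.Connection:
    """Create a hypothetical log table that stores raw bytes per record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statement_log ("
        " id INTEGER PRIMARY KEY,"
        " client_encoding TEXT NOT NULL,"
        " payload BLOB NOT NULL)"
    )
    return conn


def log_statement(conn: sqlite3.Connection, raw: bytes, client_encoding: str) -> None:
    """Store the statement untouched; no transcoding happens while logging."""
    conn.execute(
        "INSERT INTO statement_log (client_encoding, payload) VALUES (?, ?)",
        (client_encoding, raw),
    )


def read_log(conn: sqlite3.Connection, reader_encoding: str) -> list:
    """Transcode each record to the reader's encoding at read time."""
    rows = conn.execute(
        "SELECT client_encoding, payload FROM statement_log ORDER BY id"
    )
    result = []
    for stored_encoding, payload in rows:
        text = bytes(payload).decode(stored_encoding, errors="replace")
        # Characters the reader's encoding cannot represent become "?".
        result.append(text.encode(reader_encoding, errors="replace"))
    return result


conn = open_log()
log_statement(conn, "SELECT 'テスト'".encode("euc_jp"), "euc_jp")
log_statement(conn, "SELECT 'café'".encode("latin-1"), "latin-1")
for line in read_log(conn, "utf-8"):
    print(line.decode("utf-8"))
```

Any lossy conversion then affects only the reading session, while the stored
bytes stay intact for a reader with a more capable encoding.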

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
