Re: [GENERAL] main log encoding problem

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Alexander Law <exclusion(at)gmail(dot)com>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-general(at)postgresql(dot)org, ringerc(at)ringerc(dot)id(dot)au, yi(dot)codeplayer(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: [GENERAL] main log encoding problem
Date: 2012-07-19 12:16:10
Message-ID: CAF-3MvPEY4D-QkvTte=jvOKExLcVYd2-FjtvLHQsXX+Nf2k5UQ@mail.gmail.com
Lists: pgsql-bugs pgsql-general pgsql-hackers

On 19 July 2012 13:50, Alexander Law <exclusion(at)gmail(dot)com> wrote:
>> I like Craig's idea of adding the client encoding to the log lines. A
>> possible problem with that (I'm not an encoding expert) is that a log
>> line like that will contain data about the database server meta-data
>> (log time, client encoding, etc) in the database default encoding and
>> database data (the logged query and user-supplied values) in the
>> client encoding. One option would be to use the client encoding for
>> the entire log line, but would that result in legible meta-data in
>> every encoding?
>
> I think then we get non-human readable logs. We will need one more tool to
> open and convert the log (and omit excessive encoding specification in each
> line).

Only the parts that contain user-supplied data in very different
encodings would not be "human readable", similar to what we already
have.

>> It appears that the primary issue here is that SQL statements and
>> user-supplied data are being logged, while the log-file is a text file
>> in a fixed encoding.
>
> Yes, and in my opinion there is nothing unusual about it. XML/HTML are
> examples of text files with a fixed encoding that can contain multi-language
> strings. UTF-8 is the default encoding for XML. And when it's not good
> enough (as Tatsuo noticed), you can still switch to another.

Yes, but in those examples it is acceptable that the application fails
to write the output. That, and the output needs to be converted to
various different client encodings (namely that of the visitor's
browser) anyway, so it does not really add any additional overhead.

This doesn't hold true for database server log files. Ideally, writing
those has to be reliable (how are you going to catch errors
otherwise?) and should not impact the performance of the database
server in a significant way (the less the better). The end result will
probably be somewhere in the middle.
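
To make the reliability point concrete, here is a minimal Python sketch (an illustration only: the server's logger is C code and does not work like this, and the function name to_log_bytes is hypothetical). It shows that a strict conversion to a fixed log encoding can fail on user-supplied data, while a lossy fallback always produces something that can be written:

```python
# Hypothetical illustration; not how the PostgreSQL logger is implemented.

def to_log_bytes(message: str, log_encoding: str) -> bytes:
    """Encode a message for a fixed-encoding log file without raising.

    Characters the log encoding cannot represent are written as
    backslash escapes instead of aborting the write.
    """
    return message.encode(log_encoding, errors="backslashreplace")


query = "SELECT 'Привет'"

# A strict conversion fails for text the target encoding cannot hold.
try:
    query.encode("latin-1")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The lossy variant always yields bytes that can be written to the log.
safe = to_log_bytes(query, "latin-1")
```

The price is that the escaped parts are no longer readable as written, which is the trade-off between reliability and legibility discussed above.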

>> Perhaps another solution would be to add the ability to log certain
>> types of information (not the core database server log info, of
>> course!) to a database/table so that each record can be stored in its
>> own encoding?
>> That way the transcoding doesn't have to take place until someone is
>> reading the log, you'd know what to transcode the data to (namely the
>> client_encoding of the reading session) and there isn't any issue of
>> transcoding errors while logging statements.
>
> I don't think it would be the simplest solution to the existing problem. It
> can be another branch of evolution, but it doesn't answer the question -
> what encoding to use for the core database server log?

It makes that problem much easier. If you need the "human-readable"
logs, you can write those to a different log (namely one in the
database). The result is that the server can use pretty much any
encoding (or a mix of multiple!) to write its log files.

You'll need a query to read the human-readable logs of course, but
since they're in the database, all the tools you need are already
available to you.
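
A rough sketch of that idea in Python, under loud assumptions: SQLite (from the standard library) stands in for a real log table, the table and column names (statement_log, client_encoding, statement) are made up for illustration, and the encoding names are Python codec names rather than PostgreSQL ones. Each statement is stored as raw bytes next to the encoding it arrived in, and nothing is transcoded until someone reads the log:

```python
import sqlite3

# Hypothetical schema: raw statement bytes plus the client encoding they
# were received in. SQLite is only a stand-in for a server-side log table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statement_log ("
    " logged_at TEXT NOT NULL,"
    " client_encoding TEXT NOT NULL,"
    " statement BLOB NOT NULL)"
)


def log_statement(logged_at: str, client_encoding: str, raw: bytes) -> None:
    """Store the bytes untouched, so logging itself never transcodes."""
    conn.execute(
        "INSERT INTO statement_log VALUES (?, ?, ?)",
        (logged_at, client_encoding, raw),
    )


def read_log() -> list:
    """Decode each record from its own encoding at read time."""
    rows = conn.execute(
        "SELECT logged_at, client_encoding, statement"
        " FROM statement_log ORDER BY rowid"
    )
    # errors="replace" keeps the reader working even on damaged records.
    return [(ts, bytes(raw).decode(enc, errors="replace")) for ts, enc, raw in rows]


# Two sessions with different client encodings write to the same log.
log_statement("2012-07-19 12:16:10", "cp1251", "SELECT 'Привет'".encode("cp1251"))
log_statement("2012-07-19 12:16:11", "euc_jp", "SELECT 'こんにちは'".encode("euc_jp"))

entries = read_log()
```

Any conversion error then surfaces in the reader, where it can be handled or retried, rather than in the code path that writes the log.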

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
