Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding

From: Alexander Law <exclusion(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org, ringerc(at)ringerc(dot)id(dot)au
Cc: yi(dot)codeplayer(at)gmail(dot)com
Subject: Re: BUG #6742: pg_dump doesn't convert encoding of DB object names to OS encoding
Date: 2012-07-18 15:05:07
Message-ID: 5006D0A3.5040309@gmail.com
Lists: pgsql-bugs pgsql-general pgsql-hackers

Hello!

May I propose a solution and step up?

I've read the discussion of bug #5800, and here are my 2 cents.
To make things clear, let me give an example.
I am a PostgreSQL hosting provider and I let my customers create any databases they wish.
I have clients all over the world (so they can create databases with different encodings).

The question is: what do I (as admin) want to see in my PostgreSQL log, which contains errors from all the databases?
IMHO we should consider two requirements for the log.
First, the file should be readable with a generic text viewer. Second, it should be as useful and complete as possible.

I see the following solutions.
A. We have a separate logfile for each database, each in that database's encoding.
Then all our logs will be readable, but we have to look at them one by one, which is inconvenient at the least.
Moreover, our log reader should understand what encoding to use for each file.

B. We have one logfile with the operating system encoding.
The first downside is that the logs can differ between OSes.
The second is that Windows uses a non-Unicode system encoding,
and such an encoding can't represent all national characters. So at best I will get ??? in the log.
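
To illustrate the second downside, here is a minimal Python sketch (not PostgreSQL code). The message text and the choice of cp1252 as the Windows system code page are assumptions made only for the example. It shows what happens when a message with characters outside the code page is forced into it.

```python
# Hypothetical log message from a database that uses Chinese object names.
message = 'ERROR:  relation "用户" does not exist'

# Assumption: the Windows system (ANSI) code page is cp1252.
system_encoding = "cp1252"

# Characters that the code page cannot represent are replaced by "?"
# when encoding with errors="replace"; the original text is lost.
logged = message.encode(system_encoding, errors="replace").decode(system_encoding)

print(logged)
```

Each of the two unrepresentable characters becomes one "?", so the object name cannot be recovered from the log.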

C. We have one logfile with UTF-8.
Pros: Log messages of all our clients can fit in it. We can use any generic editor/viewer to open it.
Nothing changes for Linux (and other OSes with UTF-8 encoding).
Cons: All the strings written to the log file have to go through some conversion function.
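
The idea behind C can be sketched in Python as follows. This is only an illustration under stated assumptions, not how PostgreSQL does or would implement it: the helper name to_utf8, the sample messages, and the Python codec names (stand-ins for the database encodings) are all hypothetical.

```python
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode one log line from its source encoding to UTF-8.

    Bytes that are invalid in the source encoding are replaced with
    U+FFFD instead of raising, so a bad line cannot break logging.
    """
    return raw.decode(source_encoding, errors="replace").encode("utf-8")


# Hypothetical messages as they might arrive from backends whose
# databases use different encodings (codec names are assumptions).
lines = [
    ('ОШИБКА:  отношение "тест" не существует'.encode("cp1251"), "cp1251"),
    ('ERROR:  relation "テスト" does not exist'.encode("euc_jp"), "euc_jp"),
    ("LOG:  database system is ready".encode("ascii"), "ascii"),
]

# Every line passes through the conversion before it reaches the file,
# so the resulting log is valid UTF-8 from start to end.
log = b"\n".join(to_utf8(raw, enc) for raw, enc in lines)
text = log.decode("utf-8")

print(text)
```

The cost named in the cons is visible here: every line goes through to_utf8, including lines that are already ASCII or UTF-8.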

I think that the last solution is the solution. What is your opinion?

In fact, the problem exists even with a simple installation on Windows when you use a non-English locale.
So the solution would be useful for many of us.

Best regards,
Alexander

On 05/23/2012 09:15 AM, yi huang wrote:
> I'm using postgresql 9.1.3 from debian squeeze-backports with
> zh_CN.UTF-8 locale, i find my main log (which is
> "/var/log/postgresql/postgresql-9.1-main.log") contains "???" which
> indicate some sort of charset encoding problem.

It's a known issue, I'm afraid. The PostgreSQL postmaster logs in the
system locale, and the PostgreSQL backends log in whatever encoding
their database is in. They all write to the same log file, producing a
log file full of mixed encoding data that'll choke many text editors.

If you force your editor to re-interpret the file according to the
encoding your database(s) are in, this may help.

In the future it's possible that this may be fixed by logging output to
different files on a per-database basis or by converting the text
encoding of log messages, but no agreement has been reached on the
correct approach and nobody has stepped up to implement it.

--
Craig Ringer
