Re: main log encoding problem

From: Alexander Law <exclusion(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: pgsql-general(at)postgresql(dot)org, ringerc(at)ringerc(dot)id(dot)au, yi(dot)codeplayer(at)gmail(dot)com, pgsql-bugs(at)postgresql(dot)org
Subject: Re: main log encoding problem
Date: 2012-07-19 08:21:45
Message-ID: 5007C399.6000405@gmail.com
Lists: pgsql-bugs, pgsql-general, pgsql-hackers

>> And regarding mule internal encoding - reading about Mule
>> http://www.emacswiki.org/emacs/UnicodeEncoding I found:
>> /In future (probably Emacs 22), Mule will use an internal encoding
>> which is a UTF-8 encoding of a superset of Unicode. /
>> So I still see UTF-8 as a common denominator for all the encodings.
>> I am not aware of any characters absent from Unicode. Can you please
>> provide some examples of these that can result in lossy conversion?
> You can google for "encoding "EUC_JP" has no equivalent in "UTF8"" or
> some such to find such an example. In this case PostgreSQL just throws
> an error. For frontend/backend encoding conversion this is fine. But
> what should we do for logs? Apparently we cannot throw an error here.
>
> "Unification" is another problem. Some kanji characters of CJK are
> "unified" in Unicode. The idea of unification is: if kanji A in China,
> B in Japan, and C in Korea look "similar", unify A, B and C into D. This
> is a great space saving :-) The price of this is the inability to do
> round-trip conversion. You can convert A, B or C to D, but you cannot
> convert D back to A/B/C.
>
> BTW, I'm not stuck on the mule-internal encoding. What we need here is a
> "super" encoding which could include any existing encoding without
> information loss. For this purpose, I think we can even invent a new
> encoding (maybe something like the very first proposal of ISO/IEC
> 10646?). However, using UTF-8 for this purpose seems to be just a
> disaster to me.
>
OK, maybe the time for a truly universal encoding has not yet come. Then
maybe we should just add a new parameter, "log_encoding" (UTF-8 by default),
to postgresql.conf and use this encoding consistently within
logging_collector.
If this encoding is not available, then fall back to 7-bit ASCII.
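
To make the proposal concrete, here is a rough sketch of the behaviour I have in mind, written in Python only for brevity. It is not PostgreSQL code: "log_encoding" is the proposed (not existing) parameter, and to_log_encoding() and its arguments are hypothetical names. The point is that the logger never raises an error: bytes that cannot be decoded are replaced, and characters that cannot be represented in the target encoding are written as escapes.

```python
import codecs


def to_log_encoding(raw: bytes, source_encoding: str,
                    log_encoding: str = "utf-8") -> bytes:
    """Re-encode one log message for the log file without ever raising.

    raw             -- message bytes as produced by a backend
    source_encoding -- encoding the backend used for the message
    log_encoding    -- value of the proposed (hypothetical) log_encoding
                       parameter; UTF-8 by default
    """
    # Undecodable bytes become U+FFFD instead of causing an error,
    # because the logger has nowhere to report a conversion failure.
    text = raw.decode(source_encoding, errors="replace")

    # If the configured encoding is not available, fall back to 7-bit ASCII.
    try:
        codecs.lookup(log_encoding)
    except LookupError:
        log_encoding = "ascii"

    # Characters the target encoding cannot represent are written as
    # backslash escapes, so the log line is lossy but still readable.
    return text.encode(log_encoding, errors="backslashreplace")
```

With this, a message produced in WIN1251 ends up as UTF-8 in the log, and with an unavailable log_encoding the same function degrades to ASCII with escapes instead of failing. Note that this does nothing about the unification problem mentioned above; it only keeps the log file in one consistent encoding.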
