Re: handling unconvertible error messages

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: handling unconvertible error messages
Date: 2016-07-27 11:53:01
Message-ID: CAMsr+YFL0b1886tMYF9RPeDdpWryG1cr8ew3pYfiXgrJofpHjA@mail.gmail.com
Lists: pgsql-hackers

On 25 July 2016 at 22:43, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:

> Example: I have a database cluster initialized with --locale=ru_RU.UTF-8
> (built with NLS). Let's say for some reason, I have client encoding set
> to LATIN1. All error messages come back like this:
>
> test=> select * from notthere;
> ERROR: character with byte sequence 0xd0 0x9e in encoding "UTF8" has no
> equivalent in encoding "LATIN1"
>
> There is no straightforward way for the client to learn that there is a
> real error message, but it could not be converted.
>
> I think ideally we could make this better in two ways:
>
> 1) Send the original error message untranslated. That would require
> saving the original error message in errmsg(), errdetail(), etc. That
> would be a lot of work for only the occasional use. But it would also
> facilitate an occasionally-requested feature of writing untranslated
> error messages into the server log or the csv log, while sending
> translated messages to the client (or some variant thereof).
>
> 2) Send an indication that there was an encoding problem. Maybe a
> NOTICE, or an error context? Wiring all this into elog.c looks a bit
> tricky, however.
>
>
We have a similar problem with the server logs, with an additional twist:
when there is no character-mapping failure, we ignore text encoding
entirely and write each message to the log file in whatever encoding the
client asked the backend to use. So a log file can be a line-by-line mix of
UTF-8, ISO-8859-1, and whatever other fun encodings someone asks for. There
is *no* way to read such a file correctly, since lines carry no marking of
their encoding and no tools support text files whose encoding varies line
by line anyway.
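
To make the mixed-encoding problem concrete, here is a minimal Python
sketch. It is not PostgreSQL code: the two messages and the helper name
try_decode are made up for illustration. It builds a two-line "log" from
one UTF-8 line and one ISO-8859-1 line and tries to read it back with a
single encoding.

```python
from typing import Optional

# Two made-up log lines, each written in a different encoding
# (illustrative only; these are not real server output).
russian = 'ОШИБКА: отношение "notthere" не существует\n'
french = "ERREUR: la relation « notthere » n'existe pas\n"
log_bytes = russian.encode("utf-8") + french.encode("latin-1")


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Reading the whole file as UTF-8 fails on the ISO-8859-1 line.
as_utf8 = try_decode(log_bytes, "utf-8")

# Reading it as ISO-8859-1 never fails, because every byte value is valid
# there, but the UTF-8 line comes back as mojibake rather than Russian.
as_latin1 = try_decode(log_bytes, "latin-1")
```

Neither single-encoding read recovers both lines, which is why a per-line
encoding marker (or one fixed log encoding) would be needed.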

I'm not sure how closely it ties in to the issue you mention, but I think
it's at least related enough to keep in mind while considering the
client_encoding issue.

I suggest (3) "log the message with unmappable characters masked". I would
also definitely like to be able to send the raw original, along with a
field indicating its encoding (which won't be the client_encoding), since
we need some way to get at the info.
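
A rough Python sketch of what (3) could look like, purely as an
illustration of the idea. The ClientMessage fields and the
prepare_for_client function are hypothetical names, not existing elog.c
or protocol fields, and "?" as the mask character is just what Python's
"replace" error handler produces.

```python
from dataclasses import dataclass


@dataclass
class ClientMessage:
    text: bytes              # message in the client encoding, possibly masked
    masked: bool             # True if any character had to be replaced
    original: bytes          # raw message bytes as the server produced them
    original_encoding: str   # encoding of `original` (not the client encoding)


def prepare_for_client(message: str, server_encoding: str,
                       client_encoding: str) -> ClientMessage:
    """Convert a message for the client, masking characters that cannot be mapped."""
    original = message.encode(server_encoding)
    try:
        text = message.encode(client_encoding)
        masked = False
    except UnicodeEncodeError:
        # Replace each unmappable character with "?" instead of failing outright.
        text = message.encode(client_encoding, errors="replace")
        masked = True
    return ClientMessage(text, masked, original, server_encoding)


# A UTF8 server message sent to a LATIN1 client (shortened, made-up message).
msg = prepare_for_client('ОШИБКА: "notthere"', "utf-8", "latin-1")
```

Here msg.text keeps the ASCII parts readable, msg.masked tells the client
that something was lost, and msg.original plus msg.original_encoding let a
client that cares recover the real message.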

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
