Re: Internationalized error messages

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Internationalized error messages
Date: 2001-03-09 02:00:09
Message-ID: 4741.984103209@sss.pgh.pa.us
Lists: pgsql-hackers

> On Thu, Mar 08, 2001 at 11:49:50PM +0100, Peter Eisentraut wrote:
>> I really feel that translated error messages need to happen soon.

Agreed.

ncm(at)zembu(dot)com (Nathan Myers) writes:
> Similar approaches have been tried frequently, and even enshrined
> in standards (e.g. POSIX catgets), but have almost always proven too
> cumbersome. The problem is that keeping programs that interpret the
> numeric code in sync with the program they monitor is hard, and trying
> to avoid breaking all those secondary programs hinders development on
> the primary program. Furthermore, assigning code numbers is a nuisance,
> and they add uninformative clutter.

There's a difficult tradeoff to make here, but I think we do want to
distinguish between the "official error code" --- the thing that has
translations into various languages --- and what the backend is actually
allowed to print out. It seems to me that a fairly large fraction of
the unique messages found in the backend can all be lumped under the
category of "internal error", and that we need to have only one official
error code and one user-level translated message for the lot of them.
But we do want to be able to print out different detail messages for
each of those internal errors. There are other categories that might be
lumped together, but that one alone is sufficiently large to force us
to recognize it. This suggests a distinction between a "primary" or
"user-level" error message, which we catalog and provide translations
for, and a "secondary", "detail", or "wizard-level" error message that
exists only in the backend source code, and only in English, and so
can be made up on the spur of the moment.

Another thing that's bothered me for a long time is our inconsistent
approach to determining where in the code a message comes from. A lot
of the messages currently embed the name of the generating routine right
into the error text. Again, we ought to separate the functionality:
the source-code location is valuable but ought not form part of the
primary error message. I would like to see elog() become a macro that
uses __FILE__ and __LINE__ to automatically make the *exact* source
code location become part of the secondary error information, and then
drop the convention of using the routine name in the message text.
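
A minimal sketch of the macro idea, for illustration only. The names
ELOG, elog_at, and ErrorLocation are hypothetical and are not the
backend's actual interface, and the variadic-macro syntax assumes a
C99 compiler, which we may not be able to require everywhere:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of where an error was raised. */
typedef struct ErrorLocation
{
    const char *file;
    int         line;
} ErrorLocation;

/* For the sketch, the most recent error is simply kept in static storage. */
static ErrorLocation last_error_location;
static char last_error_text[256];

/*
 * Hypothetical worker function: the source location arrives as separate
 * arguments, so it never has to be embedded in the message text.
 */
static void
elog_at(const char *file, int line, const char *fmt, ...)
{
    va_list     args;

    assert(file != NULL && fmt != NULL);

    last_error_location.file = file;
    last_error_location.line = line;

    va_start(args, fmt);
    vsnprintf(last_error_text, sizeof(last_error_text), fmt, args);
    va_end(args);
}

/* The macro supplies the call site's file and line automatically. */
#define ELOG(...) elog_at(__FILE__, __LINE__, __VA_ARGS__)
```

A call such as ELOG("Attribute \"%s\" not found", "foo") then records
the caller's file and line without the caller naming its own routine.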

Something else we have talked about off-and-on is providing locator
information for errors that can be associated with a particular point in
the query string (lexical and syntactic errors). This would probably be
best returned as a character index.
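
As a rough sketch of what a frontend could do with such an index, the
function below prints a caret under the offending character. The name
format_query_pointer is hypothetical, and the sketch assumes a
single-line query and a 0-based index, neither of which has been
decided:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the query followed by a line with a caret under the character
 * at char_index.  Assumes a single-line query and a 0-based index.
 */
static void
format_query_pointer(const char *query, int char_index,
                     char *buf, size_t buflen)
{
    int         len = (int) strlen(query);

    assert(char_index >= 0 && char_index <= len);

    /* "%*s" with an empty string emits char_index spaces before the caret. */
    snprintf(buf, buflen, "%s\n%*s^\n", query, char_index, "");
}
```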

Another thing that I missed in Peter's proposal is how we are going to
cope with messages that include parameters. Surely we do not expect
gettext to start with 'Attribute "foo" not found' and distinguish fixed
from variable parts of that string?
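
One way this could work, sketched under assumptions: translate the
message template before the parameters are substituted, so the lookup
key never contains the variable part. The catalog below is a stand-in
for a real message catalog, and lookup_template,
format_attribute_error, and the German text are made up for the
example:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical catalog entry: English template and its translation. */
typedef struct CatalogEntry
{
    const char *msgid;
    const char *translation;
} CatalogEntry;

/* Illustrative translation only; a real catalog would live outside the code. */
static const CatalogEntry german_catalog[] = {
    {"Attribute \"%s\" not found", "Attribut \"%s\" nicht gefunden"},
};

/* Return the translated template, or the original if none is cataloged. */
static const char *
lookup_template(const char *msgid)
{
    size_t      n = sizeof(german_catalog) / sizeof(german_catalog[0]);
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (strcmp(german_catalog[i].msgid, msgid) == 0)
            return german_catalog[i].translation;
    }
    return msgid;
}

/* Look up the fixed template first, then insert the parameter. */
static void
format_attribute_error(char *buf, size_t buflen, const char *attname)
{
    const char *tmpl = lookup_template("Attribute \"%s\" not found");

    snprintf(buf, buflen, tmpl, attname);
}
```

The point of the ordering is that the catalog only ever sees the fixed
template, so it never needs to tell "foo" apart from the surrounding
text.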

So it's clear that we need to devise a way of breaking an "error
message" into multiple portions, including:

Primary error message (localizable)
Parameters to insert into error message (user identifiers, etc)
Secondary (wizard) error message (optional)
Source code location
Query text location (optional)

and perhaps others that I have forgotten about. One of the key things
to think about is whether we can, or should try to, transmit all this
stuff in a backwards-compatible protocol. That would mean we'd have
to dump all the info into a single string, which is doable but would
perhaps look pretty ugly:

    ERROR: Attribute "foo" not found    -- basic message for dumb frontends
    ERRORCODE: UNREC_IDENT              -- key for finding localized message
    PARAM1: foo                         -- something to embed in the localized message
    MESSAGE: Attribute or table name not known within context of query
    CODELOC: src/backend/parser/parse_clause.c line 345
    QUERYLOC: 22

Alternatively we could suppress most of this stuff unless the frontend
specifically asks for it (and presumably is willing to digest it for
the user).
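
To make the two options concrete, here is a sketch of a structure
holding the separate portions and a function that flattens it into a
single string, emitting only the basic message unless the frontend
asked for more. ErrorReport, format_error_report, and the verbose flag
are hypothetical names for illustration, not a proposed protocol:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the separate portions of one error. */
typedef struct ErrorReport
{
    const char *basic_message;  /* already-formatted English text */
    const char *error_code;     /* key for finding localized message */
    const char *param1;         /* parameter to embed; may be NULL */
    const char *detail;         /* secondary (wizard) message; may be NULL */
    const char *source_file;    /* source code location */
    int         source_line;
    int         query_location; /* character index, or -1 if unknown */
} ErrorReport;

/* Append "LABEL: value\n" to a NUL-terminated buffer, truncating if full. */
static void
append_line(char *buf, size_t buflen, const char *label, const char *value)
{
    size_t      used = strlen(buf);

    if (used < buflen)
        snprintf(buf + used, buflen - used, "%s: %s\n", label, value);
}

/* Flatten the report; with verbose == 0 only the basic message is sent. */
static void
format_error_report(const ErrorReport *r, int verbose,
                    char *buf, size_t buflen)
{
    char        scratch[512];

    assert(buflen > 0);
    buf[0] = '\0';

    append_line(buf, buflen, "ERROR", r->basic_message);
    if (!verbose)
        return;

    append_line(buf, buflen, "ERRORCODE", r->error_code);
    if (r->param1 != NULL)
        append_line(buf, buflen, "PARAM1", r->param1);
    if (r->detail != NULL)
        append_line(buf, buflen, "MESSAGE", r->detail);

    snprintf(scratch, sizeof(scratch), "%s line %d",
             r->source_file, r->source_line);
    append_line(buf, buflen, "CODELOC", scratch);

    if (r->query_location >= 0)
    {
        snprintf(scratch, sizeof(scratch), "%d", r->query_location);
        append_line(buf, buflen, "QUERYLOC", scratch);
    }
}
```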

Bottom line for me is that if we are going to go to the trouble of
examining and changing every single elog() in the system, we should
try to get all of these issues cleaned up at once. Let's not have to
go back and do it again later.

regards, tom lane
