Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.
Date: 2012-06-13 01:52:56
Message-ID: 28472.1339552376@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

I wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> I am not sure about the idea of letting the detail run to the end of the
>> line; that would be problematic should the line be long (there might not
>> be newlines in the literal at all, which is not that unusual). I think
>> it should be truncated at, say, 76 chars or so.

> Yeah, I was wondering about trying to provide a given amount of context
> instead of fixing it to "one line". We could do something like
> (1) back up N characters;
> (2) find the next newline, if there is one at least M characters before
> the error point;
> (3) print from there to the error point.

After experimenting with this for awhile I concluded that the above is
overcomplicated, and that we might as well just print up to N characters
of context; in most input, the line breaks are far enough apart that
preferentially breaking at them just leads to not having very much
context. Also, it seems like it might be a good idea to present the
input as a CONTEXT line, because that provides more space; you can fit
50 or so characters of data without overrunning standard display width.
This gives me output like

regression=# select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0,"twenty":0,"hundred":0,"thousand":800,
"twothous
and":800,"fivethous":3800,"tenthous":8800,"odd":0,"even":1,
"stringu1":"MAAAAA","stringu2":"AAAAAA","string4":"AAAAxx"}'
::json;
ERROR: invalid input syntax for type json
LINE 1: select '{"unique1":8800,"unique2":0,"two":0,"four":0,"ten":0...
^
DETAIL: Character with value "0x0a" must be escaped.
CONTEXT: JSON data, line 1: ..."twenty":0,"hundred":0,"thousand":800,
"twothous

regression=#

I can't give too many examples because I've only bothered to context-ify
this single error case as yet ;-) ... but does this seem like a sane way
to go?

The code for this is as attached. Note that I'd rip out the normal-path
tracking of line boundaries; it seems better to have a second scan of
the data in the error case and save the cycles in non-error cases.

...
ereport(ERROR,
(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
errmsg("invalid input syntax for type json"),
errdetail("Character with value \"0x%02x\" must be escaped.",
(unsigned char) *s),
report_json_context(lex)));
...

/*
* Report a CONTEXT line for bogus JSON input.
*
* The return value isn't meaningful, but we make it non-void so that this
* can be invoked inside ereport().
*/
static int
report_json_context(JsonLexContext *lex)
{
char *context_start;
char *context_end;
char *line_start;
int line_number;
char *ctxt;
int ctxtlen;

/* Choose boundaries for the part of the input we will display */
context_start = lex->input;
context_end = lex->token_terminator;
line_start = context_start;
line_number = 1;
for (;;)
{
/* Always advance over a newline, unless it's the current token */
if (*context_start == '\n' && context_start < lex->token_start)
{
context_start++;
line_start = context_start;
line_number++;
continue;
}
/* Otherwise, done as soon as we are close enough to context_end */
if (context_end - context_start < 50)
break;
/* Advance to next multibyte character */
if (IS_HIGHBIT_SET(*context_start))
context_start += pg_mblen(context_start);
else
context_start++;
}

/* Get a null-terminated copy of the data to present */
ctxtlen = context_end - context_start;
ctxt = palloc(ctxtlen + 1);
memcpy(ctxt, context_start, ctxtlen);
ctxt[ctxtlen] = '\0';

return errcontext("JSON data, line %d: %s%s",
line_number,
(context_start > line_start) ? "..." : "",
ctxt);
}

regards, tom lane

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Robert Haas 2012-06-13 02:07:44 Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.
Previous Message Tom Lane 2012-06-12 21:26:46 Re: [COMMITTERS] pgsql: Mark JSON error detail messages for translation.

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-06-13 02:02:35 Re: 9.3: load path to mitigate load penalty for checksums
Previous Message Tatsuo Ishii 2012-06-13 01:45:25 Re: pg_dump and thousands of schemas