Re: Bug in error reporting for multi-line JSON

From: Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Bug in error reporting for multi-line JSON
Date: 2021-01-26 09:07:09
Message-ID: CANugjhuR8U+uzSptTZ7tEUGvCt8vGnSz89rRw+89yd+wx_nLvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Mon, Jan 25, 2021 at 6:08 PM Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
wrote:

> On Thu, Jan 21, 2021 at 6:08 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com> writes:
> > > JSON parsing reports the line number and relevant context info
> > > incorrectly when the JSON contains newlines. Current code mostly just
> > > says "LINE 1" and is misleading for error correction. There were no
> > > tests for this previously.
> >
> > Couple thoughts:
> >
> > * I think you are wrong to have removed the line number bump that
> > happened when report_json_context advances context_start over a
> > newline. The case is likely harder to get to now, but it can still
> > happen can't it? If it can't, we should remove that whole stanza.
>
> OK, I'm playing around with this to see what is needed.
>
> > * I'd suggest naming the new JsonLexContext field "pos_last_newline";
> > "linefeed" is not usually the word we use for this concept. (Although
> > actually, it might work better if you make that point to the char
> > *after* the newline, in which case "last_linestart" might be the
> > right name.)
>
> Yes, OK
>
> > * I'm not enthused about back-patching. This behavior seems like an
> > improvement, but that doesn't mean people will appreciate changing it
> > in stable branches.
>
> OK, as you wish.
>
> Thanks for the review, will post again soon with an updated patch.
>

I agree with Tom's feedback.. Whether you change pos_last_linefeed to point
to the character after the linefeed or not, we can still simplify the for
loop within the "report_json_context" function to:

=================
context_start = lex->input + lex->pos_last_linefeed;
context_start += (*context_start == '\n'); /* Let's move beyond the
linefeed */
context_end = lex->token_terminator;
line_start = context_start;
while (context_end - context_start >= 50 && context_start < context_end)
{
/* Advance to next multibyte character */
if (IS_HIGHBIT_SET(*context_start))
context_start += pg_mblen(context_start);
else
context_start++;
}
=================

IMHO, this should work as pos_last_linefeed points to the position of the
last linefeed before the error occurred, hence we can safely skip it and
move the start_context forward.

>
> --
> Simon Riggs http://www.EnterpriseDB.com/
>
>
>

--
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
ADDR: 10318 WHALLEY BLVD, Surrey, BC
CELL:+923335449950 EMAIL: mailto:hamid(dot)akhtar(at)highgo(dot)ca
SKYPE: engineeredvirus

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-01-26 11:17:09 Re: BUG #16837: Invalid memory access on \h in psql
Previous Message Thomas Munro 2021-01-26 08:26:03 Re: BUG #16827: macOS interrupted syscall leads to a crash

Browse pgsql-hackers by date

  From Date Subject
Next Message Jaime Casanova 2021-01-26 09:19:19 Re: FETCH FIRST clause PERCENT option
Previous Message Heikki Linnakangas 2021-01-26 09:00:56 Re: shared tempfile was not removed on statement_timeout