Re: WIP Incremental JSON Parser

From: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP Incremental JSON Parser
Date: 2024-03-20 17:09:33
Message-ID: CAOYmi+nY=rF6dJCzaOuA3d-3FbwXCcecOs_S1NutexFA3dRXAw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 19, 2024 at 3:07 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On Mon, Mar 18, 2024 at 3:35 PM Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com> wrote:
>> With the incremental parser, I think prev_token_terminator is not
>> likely to be safe to use except in very specific circumstances, since
>> it could be pointing into a stale chunk. Some documentation around how
>> to use that safely in a semantic action would be good.
>
> Quite right. It's not safe. Should we ensure it's set to something like NULL or -1?

Nulling it out seems reasonable.

> Also, where do you think we should put a warning about it?

I was thinking in the doc comment for JsonLexContext.
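
A minimal sketch of how the nulling and the warning comment could fit together. DemoLexContext and demo_set_chunk() are hypothetical stand-ins invented for illustration, not the real JsonLexContext or any function in the patch; only the prev_token_terminator field name comes from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for JsonLexContext, reduced to the fields needed here.
 *
 * WARNING (the kind of note the doc comment could carry): when incremental
 * is true, prev_token_terminator is reset to NULL on every new chunk, because
 * the previous chunk's memory may already be gone. Semantic actions must not
 * dereference it in incremental mode.
 */
typedef struct DemoLexContext
{
	const char *input;					/* start of the current chunk */
	size_t		input_length;			/* length of the current chunk */
	const char *prev_token_terminator;	/* NULL in incremental mode */
	bool		incremental;			/* true if fed chunk by chunk */
} DemoLexContext;

/* Hypothetical helper: point the lexer at the next chunk of input. */
static void
demo_set_chunk(DemoLexContext *lex, const char *chunk, size_t len)
{
	lex->input = chunk;
	lex->input_length = len;

	/* The old pointer referred to the previous chunk, so drop it. */
	if (lex->incremental)
		lex->prev_token_terminator = NULL;
}
```

With this shape, a semantic action that forgets the rule would hit a NULL pointer right away rather than silently reading a stale chunk.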

> It also removes the frontend exits I had. In the case of stack depth, we follow the example of the RD parser and only check stack depth for backend code. In the case of the check that the lexer is set up for incremental parsing, the exit is replaced by an Assert. That means your test for an over-nested array doesn't work any more, so I have commented it out.

Hm, okay. We really ought to fix the recursive parser, but that's for
a separate thread. (Probably OAuth.) The ideal behavior IMO would be
for the caller to configure a maximum depth in the JsonLexContext.
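
As a rough illustration of that idea, here is a sketch of a caller-configured depth limit. Every name in it (DemoDepthContext, max_depth, DEMO_NESTING_TOO_DEEP, and the functions) is an assumption made up for this example; nothing like it exists in the patch, and the bracket scan ignores string literals to keep the example short.

```c
/* Hypothetical result codes for this sketch only. */
typedef enum DemoDepthResult
{
	DEMO_DEPTH_OK,
	DEMO_NESTING_TOO_DEEP
} DemoDepthResult;

/* Hypothetical context: the caller sets max_depth before parsing. */
typedef struct DemoDepthContext
{
	int			max_depth;		/* caller-chosen limit; 0 means no limit */
	int			depth;			/* current nesting level */
} DemoDepthContext;

/* Enter one nesting level, or report an error instead of exiting. */
static DemoDepthResult
demo_push(DemoDepthContext *ctx)
{
	if (ctx->max_depth > 0 && ctx->depth >= ctx->max_depth)
		return DEMO_NESTING_TOO_DEEP;
	ctx->depth++;
	return DEMO_DEPTH_OK;
}

/* Leave one nesting level. */
static void
demo_pop(DemoDepthContext *ctx)
{
	if (ctx->depth > 0)
		ctx->depth--;
}

/* Toy driver: tracks only brackets and braces, not a real JSON parser. */
static DemoDepthResult
demo_check_nesting(DemoDepthContext *ctx, const char *s)
{
	for (; *s != '\0'; s++)
	{
		if (*s == '[' || *s == '{')
		{
			DemoDepthResult r = demo_push(ctx);

			if (r != DEMO_DEPTH_OK)
				return r;
		}
		else if (*s == ']' || *s == '}')
			demo_pop(ctx);
	}
	return DEMO_DEPTH_OK;
}
```

The point of the design is that the limit becomes an ordinary error return that frontend and backend callers can both handle, rather than a backend-only stack check.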

Note that the repalloc will eventually still exit() if the pstack gets
too big; is that a concern? Alternatively, could unbounded heap growth
be a problem for a superuser? I guess the scalars themselves aren't
capped for length...

On Wed, Mar 20, 2024 at 12:19 AM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On second thoughts, I think it might be better if we invent a new error return code for a lexer mode mismatch.

+1
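
For concreteness, a sketch of what returning an error code (instead of an exit or an Assert) might look like. The enum values, DemoLexer, and demo_parse_incremental() are hypothetical names chosen for this example and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical result codes; the real enum and its names may differ. */
typedef enum DemoParseResult
{
	DEMO_SUCCESS,
	DEMO_INCOMPLETE,			/* more chunks expected */
	DEMO_LEXER_MODE_MISMATCH	/* lexer not set up for incremental use */
} DemoParseResult;

/* Hypothetical lexer state, reduced to the mode flag. */
typedef struct DemoLexer
{
	bool		incremental;
} DemoLexer;

/*
 * Hypothetical entry point for incremental parsing. The actual parsing is
 * omitted; only the mode check and its return value are shown.
 */
static DemoParseResult
demo_parse_incremental(DemoLexer *lex, const char *chunk, bool is_last)
{
	(void) chunk;				/* unused in this sketch */

	/* Report the mismatch to the caller rather than exiting or asserting. */
	if (!lex->incremental)
		return DEMO_LEXER_MODE_MISMATCH;

	return is_last ? DEMO_SUCCESS : DEMO_INCOMPLETE;
}
```

Unlike an Assert, this check stays active in non-assert builds, and a test can exercise it without crashing the process.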

--Jacob