Re: WIP Incremental JSON Parser

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP Incremental JSON Parser
Date: 2024-03-29 16:38:38
Message-ID: 1b3abed6-9014-d3ed-408f-acdf8e1e3c9a@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2024-03-26 Tu 17:52, Jacob Champion wrote:
> On Mon, Mar 25, 2024 at 3:14 PM Jacob Champion
> <jacob(dot)champion(at)enterprisedb(dot)com> wrote:
>> - add Assert calls in impossible error cases [2]
> To expand on this one, I think these parts of the code (marked with
> `<- here`) are impossible to reach:
>
>> switch (top)
>> {
>> case JSON_TOKEN_STRING:
>> if (next_prediction(pstack) == JSON_TOKEN_COLON)
>> ctx = JSON_PARSE_STRING;
>> else
>> ctx = JSON_PARSE_VALUE; <- here
>> break;
>> case JSON_TOKEN_NUMBER: <- here
>> case JSON_TOKEN_TRUE: <- here
>> case JSON_TOKEN_FALSE: <- here
>> case JSON_TOKEN_NULL: <- here
>> case JSON_TOKEN_ARRAY_START: <- here
>> case JSON_TOKEN_OBJECT_START: <- here
>> ctx = JSON_PARSE_VALUE;
>> break;
>> case JSON_TOKEN_ARRAY_END: <- here
>> ctx = JSON_PARSE_ARRAY_NEXT;
>> break;
>> case JSON_TOKEN_OBJECT_END: <- here
>> ctx = JSON_PARSE_OBJECT_NEXT;
>> break;
>> case JSON_TOKEN_COMMA: <- here
>> if (next_prediction(pstack) == JSON_TOKEN_STRING)
>> ctx = JSON_PARSE_OBJECT_NEXT;
>> else
>> ctx = JSON_PARSE_ARRAY_NEXT;
>> break;
> Since none of these cases are non-terminals, the only way to get to
> this part of the code is if (top != tok). But inspecting the
> productions and transitions that can put these tokens on the stack, it
> looks like the only way for them to be at the top of the stack to
> begin with is if (tok == top). (Otherwise, we would have chosen a
> different production, or else errored out on a non-terminal.)
>
> This case is possible...
>
>> case JSON_TOKEN_STRING:
>> if (next_prediction(pstack) == JSON_TOKEN_COLON)
>> ctx = JSON_PARSE_STRING;
> ...if we're in the middle of JSON_PROD_[MORE_]KEY_PAIRS. But the
> corresponding else case is not, because if we're in the middle of a
> _KEY_PAIRS production, the next_prediction() _must_ be
> JSON_TOKEN_COLON.
>
> Do you agree, or am I missing a way to get there?
>

One good way of testing would be to add the Asserts, build with
-DFORCE_JSON_PSTACK, and run the standard regression suite, which has a
fairly comprehensive set of JSON errors. I'll play with that.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2024-03-29 16:41:53 Re: Popcount optimization using AVX512
Previous Message Melanie Plageman 2024-03-29 16:32:21 Re: Combine Prune and Freeze records emitted by vacuum