Re: WIP Incremental JSON Parser

From: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP Incremental JSON Parser
Date: 2024-02-26 15:08:18
Message-ID: CAOYmi+nHV55Uhz+o-HKq0GNiWn2L5gMcuyRQEz_fqpGY=pFxKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 22, 2024 at 3:43 PM Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> > Are there plans to fill out the test suite more? Since we should be
> > able to control all the initial conditions, it'd be good to get fairly
> > comprehensive coverage of the new code.
>
> Well, it's tested (as we know) by the backup manifest tests. During
> development, I tested by making the regular parser use the
> non-recursive parser (see FORCE_JSON_PSTACK). That doesn't test the
> incremental piece of it, but it does check that the rest of it is doing
> the right thing. We could probably extend the incremental test by making
> it output a stream of tokens and making sure that was correct.

That would also cover all the semantic callbacks (currently,
OFIELD_END and AELEM_* are uncovered), so +1 from me.

Looking at lcov, it'd be good to
- test failure modes as well as the happy path, so we know we're
rejecting invalid syntax correctly
- test the prediction stack resizing code
- provide targeted coverage of the partial token support, since at the
moment we're at the mercy of the manifest format and the default chunk
size

As a brute force example of the latter, with the attached diff I get
test failures at chunk sizes 1, 2, 3, 4, 6, and 12.

> > As an aside, I find the behavior of need_escapes=false to be
> > completely counterintuitive. I know the previous code did this, but it
> > seems like the opposite of "provides unescaped strings" should be
> > "provides raw strings", not "all strings are now NULL".
>
> Yes, we could possibly call it "need_strings" or something like that.

+1

--Jacob

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2024-02-26 15:10:08 Re: WIP Incremental JSON Parser
Previous Message Christoph Berg 2024-02-26 15:05:05 Re: Printing backtrace of postgres processes