Re: WIP Incremental JSON Parser

From: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP Incremental JSON Parser
Date: 2024-03-07 15:28:45
Message-ID: CAOYmi+kNPw+3hjjQarC8bm-iDa=CRmTB1TMb35zbydRLK+7+1A@mail.gmail.com
Lists: pgsql-hackers

Some more observations as I make my way through the patch:

In src/common/jsonapi.c,

> +#define JSON_NUM_NONTERMINALS 6

Should this be 5 now?

> + res = pg_parse_json_incremental(&(incstate->lex), &(incstate->sem),
> + chunk, size, is_last);
> +
> + expected = is_last ? JSON_SUCCESS : JSON_INCOMPLETE;
> +
> + if (res != expected)
> + json_manifest_parse_failure(context, "parsing failed");

This leads to error messages like

pg_verifybackup: error: could not parse backup manifest: parsing failed

which I would imagine is going to lead to confused support requests in
the event that someone does manage to corrupt their manifest. Can we
make use of json_errdetail() and print the line and column numbers?
Patch 0001 over at [1] has one approach to making json_errdetail()
workable in frontend code.
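
As a rough illustration of the kind of line/column reporting meant here
(a sketch only, not code from the patch): ManifestPos, pos_advance() and
format_parse_failure() are made-up names, and the detail string stands
in for whatever json_errdetail() would return. The position is carried
across chunks, since an incremental caller never sees the whole manifest
at once. The lexer in jsonapi.c may well track its position differently.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical position tracker; both fields are 1-based. */
typedef struct ManifestPos
{
	int			line;
	int			column;
} ManifestPos;

/*
 * Advance the position over one chunk of input.  Call this for each
 * chunk (or for the consumed part of a chunk) so that the position
 * stays correct across chunk boundaries.
 */
static void
pos_advance(ManifestPos *pos, const char *chunk, size_t len)
{
	for (size_t i = 0; i < len; i++)
	{
		if (chunk[i] == '\n')
		{
			pos->line++;
			pos->column = 1;
		}
		else
			pos->column++;
	}
}

/*
 * Build an error message that carries the parser's detail text plus the
 * location, instead of a bare "parsing failed".  Returns the snprintf()
 * result.
 */
static int
format_parse_failure(char *out, size_t outlen, const char *detail,
					 const ManifestPos *pos)
{
	return snprintf(out, outlen,
					"could not parse backup manifest: %s (line %d, column %d)",
					detail, pos->line, pos->column);
}
```

With something along these lines, a corrupted manifest would produce a
message that points at the offending location rather than only saying
that parsing failed.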

Top-level scalars like `false` or `12345` do not parse correctly if
the chunk size is too small; instead of succeeding, the parse fails and
json_errdetail() reports 'Token "" is invalid'. With small chunk sizes,
json_errdetail() additionally segfaults on constructions like `[tru]`
or `12zz`.

For my local testing, I'm carrying the following diff in
001_test_json_parser_incremental.pl:

> - ok($stdout =~ /SUCCESS/, "chunk size $size: test succeeds");
> - ok(!$stderr, "chunk size $size: no error output");
> + like($stdout, qr/SUCCESS/, "chunk size $size: test succeeds");
> + is($stderr, "", "chunk size $size: no error output");

This is particularly helpful when a test fails spuriously due to code
coverage output sprayed on stderr, since like() and is() print the
actual value they received when they fail.

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/CAOYmi%2BmSSY4SvOtVN7zLyUCQ4-RDkxkzmTuPEN%2Bt-PsB7GHnZA%40mail.gmail.com
