From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Amit Langote <amitlangote09(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: WIP Incremental JSON Parser
Date: 2024-04-09 05:23:55
Message-ID: ZhTQ6_w1vwOhqTQI@paquier.xyz
Lists: pgsql-hackers

On Tue, Apr 09, 2024 at 09:48:18AM +0900, Michael Paquier wrote:
> There is no direct check on test_json_parser_perf.c, either, only a
> custom rule in the Makefile, with nothing equivalent for meson.
> So it looks like you could do a short execution check in a TAP test,
> at least.

While reading the code, I've noticed a few things, as well:

+ /* max delicious line length is less than this */
+ char buff[6001];

"Delicious" applies to the contents and has nothing to do with this
code, even if I'd like to think that these lines of code are edible
and good. Using a hardcoded limit for some JSON input sounds like a
bad idea to me, even if this is a test module. The same comment
applies to both the perf and incremental tools. You could use a
#define'd buffer size for readability rather than assuming this value
in many places.
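Something like the following sketch, say, where BUFSIZE and the
surrounding loop are made up by me for illustration and are not
taken from the patch:

#include <stdio.h>

#define BUFSIZE 6001		/* max line length in the test data, plus terminator */

int
main(int argc, char **argv)
{
	char	buff[BUFSIZE];
	FILE   *json_file;

	if (argc < 2 || (json_file = fopen(argv[1], "r")) == NULL)
		return 1;
	/* Read line by line, bounded by the single #define'd size. */
	while (fgets(buff, BUFSIZE, json_file) != NULL)
	{
		/* feed each chunk to the parser here */
	}
	fclose(json_file);
	return 0;
}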

+++ b/src/test/modules/test_json_parser/test_json_parser_incremental.c
+ * This progam tests incremental parsing of json. The input is fed into
+ * full range of incement handling, especially in the lexer, is exercised.
+++ b/src/test/modules/test_json_parser/test_json_parser_perf.c
+ * Performancet est program for both flavors of the JSON parser
+ * This progam tests either the standard (recursive descent) JSON parser
+++ b/src/test/modules/test_json_parser/README
+ reads in a file and pases it in very small chunks (60 bytes at a time) to

Collection of typos across various files.

+ appendStringInfoString(&json, "1+23 trailing junk");
What's the purpose here? Perhaps the intention should be documented
in a comment?
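If the point is to check that the parser complains about garbage
following what begins as a valid number (just a guess on my part),
then something like this would make it obvious:

	/*
	 * Input that starts as a valid scalar but continues with trailing
	 * garbage; the parser should report an error rather than accept it.
	 */
	appendStringInfoString(&json, "1+23 trailing junk");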

In the end, having a way to generate random JSON blobs to test this
stuff would be more appealing than what you have currently, perhaps
with a few knobs like:
- Array and simple object density.
- Maximum level of nesting.
- Restriction to ASCII characters that can be escaped.
- Perhaps more things I cannot think of?
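A rough sketch of the kind of generator I have in mind, with all of
the knob names invented here for illustration:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* All knobs are made up, not from any existing code. */
#define MAX_NESTING			6	/* max level of nesting */
#define CONTAINER_DENSITY	30	/* % chance of array/object vs scalar */

static void
emit_value(int level)
{
	if (level < MAX_NESTING && rand() % 100 < CONTAINER_DENSITY)
	{
		int		n = 1 + rand() % 4;

		if (rand() % 2 == 0)
		{
			/* emit an array with n random members */
			printf("[");
			for (int i = 0; i < n; i++)
			{
				if (i > 0)
					printf(",");
				emit_value(level + 1);
			}
			printf("]");
		}
		else
		{
			/* emit an object with n random fields */
			printf("{");
			for (int i = 0; i < n; i++)
			{
				if (i > 0)
					printf(",");
				printf("\"k%d\":", i);
				emit_value(level + 1);
			}
			printf("}");
		}
	}
	else
		printf("%d", rand() % 1000);	/* numeric scalars only; strings
										 * and escapes left out here */
}

int
main(void)
{
	srand((unsigned) time(NULL));
	emit_value(0);
	printf("\n");
	return 0;
}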

So the current state of things is kind of disappointing, and the size
of the data set added to the tree does not make it that portable
either if you want to test various scenarios. It seems to me that
this has been committed too hastily and is not ready for integration,
even if it's just a test module. Tom has also shared some concerns
upthread, as far as I can see.
--
Michael
