WIP Incremental JSON Parser

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: WIP Incremental JSON Parser
Date: 2023-12-26 16:48:25
Message-ID: 7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net
Lists: pgsql-hackers


Quite a long time ago Robert asked me about the possibility of an
incremental JSON parser. I wrote one, and I've tweaked it a bit, but the
performance is significantly worse than that of the current recursive
descent parser. Nevertheless, I'm attaching my current WIP state for it,
and I'll add it to the next CF to keep the conversation going.

One possible use would be in parsing large manifest files for
incremental backup. However, it struck me a few days ago that this might
not work all that well. Both the current parser and the new one
palloc() space for each field name and scalar token in the JSON (unless
the token isn't used, which is normally not the case), and neither frees
it. Particularly in frontend code, this amounts to a possible memory
leak unless the semantic routines do the freeing themselves. So while we
can save some memory by not having to slurp in the whole JSON in one
hit, we aren't saving that other allocation of memory, which amounts to
almost as much space as the raw JSON itself.

In any case, I've had fun so it's not a total loss come what may :-)

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment Content-Type Size
json-incremental-parser-2023-12-26.patch text/x-patch 414.6 KB
