Re: making the backend's json parser work in frontend code

From: David Steele <david(at)pgmasters(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making the backend's json parser work in frontend code
Date: 2020-01-16 18:37:12
Message-ID: e39320c8-cfd2-9b10-255d-b3f2ffc2caaf@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Robert,

On 1/15/20 2:02 PM, Robert Haas wrote:
> The discussion on the backup manifest thread has gotten bogged down on
> the issue of the format that should be used to store the backup
> manifest file. I want something simple and ad-hoc; David Steele and
> Stephen Frost prefer JSON. That is problematic because our JSON parser
> does not work in frontend code, and I want to be able to validate a
> backup against its manifest, which involves being able to parse the
> manifest from frontend code. The latest development over there is that
> David Steele has posted the JSON parser that he wrote for pgbackrest
> with an offer to try to adapt it for use in front-end PostgreSQL code,
> an offer which I genuinely appreciate. I'll write more about that over
> on that thread. However, I decided to spend today doing some further
> investigation of an alternative approach, namely making the backend's
> existing JSON parser work in frontend code as well. I did not solve
> all the problems there, but I did come up with some patches which I
> think would be worth committing on independent grounds, and I think
> the whole series is worth posting. So here goes.

I was starting to wonder if it wouldn't be simpler to go back to the
Postgres JSON parser and see if we can adapt it. I'm not sure that it
*is* simpler, but it would almost certainly be more acceptable.

> 0001 moves wchar.c from src/backend/utils/mb to src/common. Unless I'm
> missing something, this seems like an overdue cleanup. It's long been
> the case that wchar.c is actually compiled and linked into both
> frontend and backend code. Commit
> 60f11b87a2349985230c08616fa8a34ffde934c8 added code into src/common
> that depends on wchar.c being available, but didn't actually make
> wchar.c part of src/common, which seems like an odd decision: the
> functions in the library are dependent on code that is not part of any
> library but whose source files get copied around where needed. Eh?

This looks like an obvious improvement to me.

> 0002 does some basic header cleanup to make it possible to include the
> existing header file jsonapi.h in frontend code. The state of the JSON
> headers today looks generally poor. There seems not to have been much
> attempt to get the prototypes for a given source file, say foo.c, into
> a header file with the same name, say foo.h. Also, dependencies
> between various header files seem to be have added somewhat freely.
> This patch does not come close to fixing all that, but I consider it a
> modest down payment on a cleanup that probably ought to be taken
> further.

Agreed that these header files are fairly disorganized. In general the
names json, jsonapi, jsonfuncs don't tell me a whole lot. I feel like
I'd want to include json.h to get a json parser but it only contains one
utility function before these patches. I can see that json.c primarily
contains SQL functions so that's why.

So the idea here is that json.c will have the JSON SQL functions,
jsonb.c the JSONB SQL functions, and jsonapi.c the parser, and
jsonfuncs.c the utility functions?

> 0003 splits json.c into two files, json.c and jsonapi.c. All the
> lexing and parsing stuff (whose prototypes are in jsonapi.h) goes into
> jsonapi.c, while the stuff that pertains to the 'json' data type
> remains in json.c. This also seems like a good cleanup, because to me,
> at least, it's not a great idea to mix together code that is used by
> both the json and jsonb data types as well as other things in the
> system that want to generate or parse json together with things that
> are specific to the 'json' data type.

This seems like a good first step. I wonder if the remainder of the SQL
json/jsonb functions should be moved to json.c/jsonb.c respectively?

That does represent a lot of code churn though, so perhaps not worth it.

> As far as I know all three of the above patches are committable as-is;
> review and contrary opinions welcome.

Agreed, with some questions as above.

> On the other hand, 0004, 0005, and 0006 are charitably described as
> experimental or WIP. 0004 and 0005 hack up jsonapi.c so that it can
> still be compiled even if #include "postgres.h" is changed to #include
> "postgres-fe.h" and 0006 moves it into src/common. Note that I say
> that they make it compile, not work. It's not just untested; it's
> definitely broken. But it gives a feeling for what the remaining
> obstacles to making this code available in a frontend environment are.
> Since I wrote my very first email complaining about the difficulty of
> making the backend's JSON parser work in a frontend environment, one
> obstacle has been knocked down: StringInfo is now available in
> front-end code (commit 26aaf97b683d6258c098859e6b1268e1f5da242f). The
> remaining problems (that I know about) have to do with error reporting
> and multibyte character support; a read of the patches is suggested
> for those wanting further details.

Well, with the caveat that it doesn't work, it's less than I expected.

Obviously ereport() is a pretty big deal and I agree with Michael
downthread that we should port this to the frontend code.

It would also be nice to unify functions like PQmblen() and pg_mblen()
if possible.

The next question in my mind is given the caveat that the error handing
is questionable in the front end, can we at least render/parse valid
JSON with the code?

Regards,
--
-David
david(at)pgmasters(dot)net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2020-01-16 18:39:31 Re: making the backend's json parser work in frontend code
Previous Message Andres Freund 2020-01-16 18:08:48 Re: SlabCheck leaks memory into TopMemoryContext