Proposal to use JSON for Postgres Parser format

From: Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal to use JSON for Postgres Parser format
Date: 2022-09-20 00:15:54
Message-ID: CACxu=vL_SD=WJiFSJyyBuZAp_2v_XBqb1x9JBiqz52a_g9z3jA@mail.gmail.com
Lists: pgsql-hackers

Hello hackers,

As noted in the source:

https://github.com/postgres/postgres/blob/master/src/include/nodes/pg_list.h#L6-L11

* Once upon a time, parts of Postgres were written in Lisp and used real
* cons-cell lists for major data structures. When that code was rewritten
* in C, we initially had a faithful emulation of cons-cell lists, which
* unsurprisingly was a performance bottleneck. A couple of major rewrites
* later, these data structures are actually simple expansible arrays;
* but the "List" name and a lot of the notation survives.

The Postgres parser format as described in the wiki page:

https://wiki.postgresql.org/wiki/Query_Parsing

looks almost, but not quite, entirely like JSON:

SELECT * FROM foo where bar = 42 ORDER BY id DESC LIMIT 23;
(
   {SELECT
   :distinctClause <>
   :intoClause <>
   :targetList (
      {RESTARGET
      :name <>
      :indirection <>
      :val
         {COLUMNREF
         :fields (
            {A_STAR
            }
         )
         :location 7
         }
      :location 7
      }
   )
   :fromClause (
      {RANGEVAR
      :schemaname <>
      :relname foo
      :inhOpt 2
      :relpersistence p
      :alias <>
      :location 14
      }
   )
   ... and so on
)

This non-standard format is useful for visual inspection and perhaps
simple parsing. The parsers that do exist for it are generally specific
to particular languages. If the parse tree were emitted in a standard
format, tools such as code generators and analyzers could work with the
many libraries that already handle JSON quite well. A future possibility
would be exposing this data to ddl_command_start event triggers.
Providing a JSON Schema would also aid tools that want to validate or
transform the JSON with rule-based systems.
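
To illustrate how close the format already is to JSON, here is a minimal
sketch in Python of a converter for the constructs that appear in the
excerpt above: `{NAME :field value ...}` nodes, `( ... )` lists, the `<>`
empty marker, and bare atoms. It is not how `readfuncs.c` works, it does
not handle quoted or escaped strings or truncated input, and the
`node_type` key is a made-up name for this example.

```python
import json
import re

# Tokens: the <> empty marker, a single brace or paren, or any run of
# characters that is neither whitespace nor a bracket (labels and atoms).
TOKEN_RE = re.compile(r"<>|[{}()]|[^\s{}()]+")


def parse_value(tokens, pos):
    """Parse one value starting at tokens[pos]; return (value, next_pos)."""
    tok = tokens[pos]
    if tok == "<>":
        return None, pos + 1
    if tok == "(":
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = parse_value(tokens, pos)
            items.append(item)
        return items, pos + 1
    if tok == "{":
        # "node_type" is a hypothetical key name chosen for this sketch.
        node = {"node_type": tokens[pos + 1]}
        pos += 2
        while tokens[pos] != "}":
            label = tokens[pos]
            if not label.startswith(":"):
                raise ValueError(f"expected a :field label, got {label!r}")
            value, pos = parse_value(tokens, pos + 1)
            node[label[1:]] = value
        return node, pos + 1
    # Bare atom: treat plain integers as numbers, everything else as text.
    if re.fullmatch(r"-?\d+", tok):
        return int(tok), pos + 1
    return tok, pos + 1


def parse_node_text(text):
    """Convert a node dump into nested Python lists and dicts."""
    tree, _ = parse_value(TOKEN_RE.findall(text), 0)
    return tree


# The fromClause fragment from the excerpt above.
SAMPLE = """
(
   {RANGEVAR
   :schemaname <>
   :relname foo
   :inhOpt 2
   :relpersistence p
   :alias <>
   :location 14
   }
)
"""

tree = parse_node_text(SAMPLE)
print(json.dumps(tree, indent=2))
```

Every tool that wants to read the current format needs something like
this per language; with JSON output the whole converter reduces to a
single call to an existing library.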

I would like to propose a discussion about switching, in a future major
release of Postgres, from this custom format to JSON. The format in
question is generated by macros and functions found in
`src/backend/nodes/readfuncs.c` and `src/backend/nodes/outfuncs.c`, and
converting them to emit valid JSON would be relatively
straightforward.
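
As a rough illustration of why the change could be mostly a matter of
punctuation, here is a toy model in Python (not the actual C macros)
that writes the same in-memory node in both styles. The node layout,
the function names, and the `node_type` key are all assumptions made up
for this sketch.

```python
import json

# A toy node: a type tag plus ordered (name, value) fields.
# None models an empty field, which the current format prints as <>.
node = {
    "type": "RANGEVAR",
    "fields": [("schemaname", None), ("relname", "foo"), ("location", 14)],
}


def emit_legacy(node):
    """Write the node in the style of the current {NAME :field value} format."""
    parts = ["{" + node["type"]]
    for name, value in node["fields"]:
        parts.append(f":{name} {'<>' if value is None else value}")
    return " ".join(parts) + "}"


def emit_json(node):
    """Write the same node as a JSON object ("node_type" is hypothetical)."""
    obj = {"node_type": node["type"]}
    obj.update(node["fields"])
    return json.dumps(obj)


legacy_text = emit_legacy(node)
json_text = emit_json(node)
print(legacy_text)
print(json_text)
```

The two writers walk the same fields in the same order; only the
delimiters, the quoting, and the spelling of the empty value differ.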

One downside is that this would not be a forward-compatible
binary change across releases. Since it is unlikely that much
code relies on this custom format, this would not be a huge problem
for most.

Thoughts?

-Michel
