Re: Proposal to use JSON for Postgres Parser format

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal to use JSON for Postgres Parser format
Date: 2022-09-20 10:00:36
Message-ID: CAPpHfduet9MWvMD6mDRX7KsU8L2QSrCsCvkmynoP6nciNNGk5Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 20, 2022 at 7:48 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> > On Mon, Sep 19, 2022 at 8:39 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Our existing format is certainly not great on those metrics, but
> >> I do not see how "let's use JSON!" is a route to improvement.
>
> > The existing format was designed with developer convenience as a goal,
> > though -- despite my complaints, and in spite of your objections.
>
> As Munro adduces nearby, it'd be a stretch to conclude that the current
> format was designed with any Postgres-related goals in mind at all.
> I think he's right that it's a variant of some Lisp-y dump format that's
> probably far hoarier than even Berkeley Postgres.
>
> > If it didn't have to be easy (or even practical) for developers to
> > directly work with the output format, then presumably the format used
> > internally could be replaced with something lower level and faster. So
> > it seems like the two goals (developer ergonomics and faster
> > interchange format for users) might actually be complementary.
>
> I think the principal mistake in what we have now is that the storage
> format is identical to the "developer friendly" text format (plus or
> minus some whitespace). First we need to separate those. We could
> have more than one equivalent text format perhaps, and I don't have
> any strong objection to basing the text format (or one of them) on
> JSON.

+1 for considering storage format and text format separately.

Let's consider what our criteria could be for the storage format.

1) Storage efficiency (shorter is better) and
serialization/deserialization efficiency (faster is better). On
this criterion, a custom binary format looks ideal.
2) Robustness in the case of corruption. It seems much easier to
detect data corruption, and possibly to do some partial manual
recovery, with a textual format.
3) Standardization. It's better to use something known worldwide, or at
least used in other parts of PostgreSQL, than something completely
custom. From this perspective, JSON/JSONB is better than a custom
format.
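
To make the trade-off between criteria 1 and 2 concrete, here is a
minimal Python sketch. Everything in it is an assumption made for
illustration: the toy node, its field names, the tag table and the
binary layout are hypothetical and do not reflect the real node
structures or any format proposed in this thread. It only shows that
a packed binary encoding is shorter, while a damaged byte in the
textual encoding is more likely to be noticed at parse time.

```python
import json
import struct

# Hypothetical toy node; the field names are illustrative only.
node = {"type": "Const", "consttype": 23, "constlen": 4, "constvalue": 42}

# Textual encoding: compact JSON.
text_bytes = json.dumps(node, separators=(",", ":")).encode("utf-8")

# Hypothetical custom binary encoding: a 1-byte node tag followed by
# three little-endian 32-bit integers (no padding with "<").
NODE_TAGS = {"Const": 1}
BINARY_LAYOUT = "<Biii"
binary_bytes = struct.pack(
    BINARY_LAYOUT,
    NODE_TAGS[node["type"]],
    node["consttype"],
    node["constlen"],
    node["constvalue"],
)


def corrupt(data, offset):
    """Return a copy of data with every bit of one byte flipped."""
    damaged = bytearray(data)
    damaged[offset] ^= 0xFF
    return bytes(damaged)


def json_survives(data):
    """Report whether the bytes still decode and parse as JSON."""
    try:
        json.loads(data.decode("utf-8"))
        return True
    except ValueError:  # covers UnicodeDecodeError and JSONDecodeError
        return False


if __name__ == "__main__":
    print("text size:  ", len(text_bytes))
    print("binary size:", len(binary_bytes))

    # Damage the opening brace of the JSON text: the parser rejects it.
    print("damaged JSON parses:", json_survives(corrupt(text_bytes, 0)))

    # Damage the first byte of constvalue in the binary form: unpacking
    # still succeeds and silently returns a different value.
    print("damaged binary decodes to:",
          struct.unpack(BINARY_LAYOUT, corrupt(binary_bytes, 9)))
```

Note that this is only one favourable case for the textual format:
damage that lands inside a digit of a JSON number would parse just as
silently as the binary example, so a checksum would still be needed
for reliable detection in either format.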

------
Regards,
Alexander Korotkov
