Re: psql JSON output format

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: psql JSON output format
Date: 2024-01-17 09:30:43
Message-ID: fd88f7a43cc86a9c2b994ac3aa97101a7690c41a.camel@cybertec.at
Lists: pgsql-hackers

On Tue, 2024-01-16 at 14:12 -0500, Robert Haas wrote:
> On Tue, Jan 16, 2024 at 11:07 AM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > "Round-trip safety" is not so important. If you want to move data from
> > PostgreSQL to PostgreSQL, you use the plain or the binary format.
> > The CSV format by default renders NULL and empty strings identical, and
> > I don't think anybody objects to that.
>
> As Andrew says, the part about the CSV format is not correct, but I
> also don't think I agree with the larger point, either. I believe that
> round-trip safety is a really desirable property. Is it absolutely
> necessary in every case? Maybe not. But, it shouldn't be lacking
> without a good reason, either, at least IMHO. If you postulate that
> people are moving data from A to B, it is reasonable to think that
> eventually someone is going to want to move some data from B back to
> A. If that turns out to be hard, they'll be sad. We shouldn't make
> people sad without a good reason.

As mentioned in my other mail, I was talking about the psql output
format "csv" rather than about COPY.

I agree that it is desirable to lose as little information as possible.
But if we want to format query output as JSON, there are three
requirements that cannot all be satisfied (a sketch of the conflict
follows the list):

1. lose no information ("round-trip safe")

2. don't double-quote numbers, booleans and other JSON values

3. don't skip any table column in the output
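
To make the conflict concrete, consider a query like

SELECT 1 AS num, NULL::text AS t, 'null'::jsonb AS j;

(the output shapes below are only my sketch of such a format, not
necessarily what the patch produces).  If the values are rendered with
their native JSON types (#2) and no column is skipped (#3), the SQL NULL
and the JSON null become indistinguishable, and #1 is violated:

[{"num": 1, "t": null, "j": null}]

If instead everything except SQL NULL is quoted so that the two can be
told apart (#1), numbers and JSON values come out as strings, and #2 is
violated:

[{"num": "1", "t": null, "j": "null"}]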

Christoph's original patch didn't satisfy #2, and his current version
doesn't satisfy #1. Do you think that skipping NULL columns would be
the best solution? We don't do that in the to_json() function, which
also renders SQL NULL as JSON null.
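
As an aside, that existing behavior is easy to see:

SELECT to_json(t) FROM (VALUES (1, NULL::text)) AS t(a, b);

returns {"a":1,"b":null}, so the SQL NULL already comes out as a
JSON null there, and no column is skipped.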

I think the argument for round-trip safety of psql output is tenuous.
There is no way for psql to ingest JSON as an input format, and the
patch to add JSON as a COPY format only supports COPY TO.  And unless
you can name the exact way that the data written by psql will be loaded
into PostgreSQL again, all that remains is an (understandable) unease
about losing the distinction between SQL NULL and JSON null.

We have jsonb_populate_record() to convert JSON back to a table row,
but that function will convert both missing columns and a JSON null
to SQL NULL:

CREATE TABLE xy (id integer, j jsonb);

\pset null '∅'

SELECT * FROM jsonb_populate_record(NULL::xy, '{"id":1,"j":null}');

 id │ j
════╪═══
  1 │ ∅
(1 row)

SELECT * FROM jsonb_populate_record(NULL::xy, '{"id":1}');

 id │ j
════╪═══
  1 │ ∅
(1 row)

Indeed, there doesn't seem to be a way to get a JSON null into the
jsonb column with that function: it always becomes SQL NULL.
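
(The distinction is of course still visible in the JSON value itself,
for example

SELECT '{"id":1,"j":null}'::jsonb ? 'j', '{"id":1}'::jsonb ? 'j';

returns true and false, but that means inspecting the jsonb before
jsonb_populate_record() gets to it.)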

So I wouldn't worry about round-trip safety too much, and my preference
is how the current patch does it. I am not dead set against a solution
that omits NULL columns in the output, though.

Yours,
Laurenz Albe
