Re: Emitting JSON to file using COPY TO

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2023-12-06 23:28:06
Message-ID: CAKFQuwYS4HwypV3iKqx+GdsMU+-k-VTWt4bu6vc5nYqcAMgkcA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Wed, Dec 6, 2023 at 4:09 PM Joe Conway <mail(at)joeconway(dot)com> wrote:

> On 12/6/23 14:47, Joe Conway wrote:
> > On 12/6/23 13:59, Daniel Verite wrote:
> >> Andrew Dunstan wrote:
> >>
> >>> IMNSHO, we should produce either a single JSON
> >>> document (the ARRAY case) or a series of JSON documents, one per row
> >>> (the LINES case).
> >>
> >> "COPY Operations" in the doc says:
> >>
> >> " The backend sends a CopyOutResponse message to the frontend, followed
> >> by zero or more CopyData messages (always one per row), followed by
> >> CopyDone".
> >>
> >> In the ARRAY case, the first messages with the copyjsontest
> >> regression test look like this (tshark output):
> >>
> >> PostgreSQL
> >> Type: CopyOut response
> >> Length: 13
> >> Format: Text (0)
> >> Columns: 3
> >> Format: Text (0)
>
> > Anything receiving this and looking for a json array should know how to
> > assemble the data correctly despite the extra CopyData messages.
>
> Hmm, maybe the real problem here is that Columns do not equal "3" for
> the json mode case -- that should really say "1" I think, because the
> row is not represented as 3 columns but rather 1 json object.
>
> Does that sound correct?
>
> Assuming yes, there is still maybe an issue that there are two more
> "rows" that actual output rows (the "[" and the "]"), but maybe those
> are less likely to cause some hazard?
>
>
What is the limitation, if any, of introducing new type codes for these. n
= 2..N for the different variants? Or even -1 for "raw text"? And
document that columns and structural rows need to be determined
out-of-band. Continuing to use 1 (text) for this non-csv data seems like a
hack even if we can technically make it function. The semantics,
especially for the array case, are completely discarded or wrong.

David J.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2023-12-06 23:38:32 Re: Emitting JSON to file using COPY TO
Previous Message roger popa 2023-12-06 23:25:25 Re: Trainning and Certification

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-12-06 23:32:45 Re: PATCH: Add REINDEX tag to event triggers
Previous Message Nathan Bossart 2023-12-06 23:20:18 Re: Clean up some signal usage mainly related to Windows