Re: Emitting JSON to file using COPY TO

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2023-12-06 23:38:32
Message-ID: CAKFQuwaPY56pLgLsag8xsXLtacxq+4bbzdvTukSy7kY6iGAB7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Wed, Dec 6, 2023 at 4:28 PM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
wrote:

> On Wed, Dec 6, 2023 at 4:09 PM Joe Conway <mail(at)joeconway(dot)com> wrote:
>
>> On 12/6/23 14:47, Joe Conway wrote:
>> > On 12/6/23 13:59, Daniel Verite wrote:
>> >> Andrew Dunstan wrote:
>> >>
>> >>> IMNSHO, we should produce either a single JSON
>> >>> document (the ARRAY case) or a series of JSON documents, one per row
>> >>> (the LINES case).
>> >>
>> >> "COPY Operations" in the doc says:
>> >>
>> >> " The backend sends a CopyOutResponse message to the frontend, followed
>> >> by zero or more CopyData messages (always one per row), followed by
>> >> CopyDone".
>> >>
>> >> In the ARRAY case, the first messages with the copyjsontest
>> >> regression test look like this (tshark output):
>> >>
>> >> PostgreSQL
>> >> Type: CopyOut response
>> >> Length: 13
>> >> Format: Text (0)
>> >> Columns: 3
>> >> Format: Text (0)
>>
>> > Anything receiving this and looking for a json array should know how to
>> > assemble the data correctly despite the extra CopyData messages.
>>
>> Hmm, maybe the real problem here is that Columns do not equal "3" for
>> the json mode case -- that should really say "1" I think, because the
>> row is not represented as 3 columns but rather 1 json object.
>>
>> Does that sound correct?
>>
>> Assuming yes, there is still maybe an issue that there are two more
>> "rows" that actual output rows (the "[" and the "]"), but maybe those
>> are less likely to cause some hazard?
>>
>>
> What is the limitation, if any, of introducing new type codes for these.
> n = 2..N for the different variants? Or even -1 for "raw text"? And
> document that columns and structural rows need to be determined
> out-of-band. Continuing to use 1 (text) for this non-csv data seems like a
> hack even if we can technically make it function. The semantics,
> especially for the array case, are completely discarded or wrong.
>
>
Also, it seems like this answer would be easier to make if we implement
COPY FROM now since how is the server supposed to deal with decomposing
this data into tables without accurate type information? I don't see
implementing only half of the feature being a good idea. I've had much
more desire for FROM compared to TO personally.

David J.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joe Conway 2023-12-06 23:45:52 Re: Emitting JSON to file using COPY TO
Previous Message David G. Johnston 2023-12-06 23:28:06 Re: Emitting JSON to file using COPY TO

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2023-12-06 23:45:52 Re: Emitting JSON to file using COPY TO
Previous Message Rafael Thofehrn Castro 2023-12-06 23:33:56 Re: RFC: Logging plan of the running query