Re: Emitting JSON to file using COPY TO

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2024-01-23 05:31:00
Message-ID: CACJufxEh0QWok=XZTgFksSYSKLgkidud4cXJvJPxbTzVAsniPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On Fri, Jan 19, 2024 at 4:10 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> If I'm not missing, copyto_json.007.diff is the latest patch but it
> needs to be rebased to the current HEAD. Here are random comments:
>

please check the latest version.

> if (opts_out->json_mode)
> + {
> + if (is_from)
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot use JSON mode in COPY FROM")));
> + }
> + else if (opts_out->force_array)
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY FORCE_ARRAY requires JSON mode")));
>
> I think that flatting these two condition make the code more readable:

I make it two condition check

> if (opts_out->json_mode && is_from)
> ereport(ERROR, ...);
>
> if (!opts_out->json_mode && opts_out->force_array)
> ereport(ERROR, ...);
>
> Also these checks can be moved close to other checks at the end of
> ProcessCopyOptions().
>
Yes. I did it, please check it.

> @@ -3395,6 +3395,10 @@ copy_opt_item:
> {
> $$ = makeDefElem("format", (Node *) makeString("csv"), @1);
> }
> + | JSON
> + {
> + $$ = makeDefElem("format", (Node *) makeString("json"), @1);
> + }
> | HEADER_P
> {
> $$ = makeDefElem("header", (Node *) makeBoolean(true), @1);
> @@ -3427,6 +3431,10 @@ copy_opt_item:
> {
> $$ = makeDefElem("encoding", (Node *) makeString($2), @1);
> }
> + | FORCE ARRAY
> + {
> + $$ = makeDefElem("force_array", (Node *)
> makeBoolean(true), @1);
> + }
> ;
>
> I believe we don't need to support new options in old-style syntax.
>
> ---
> @@ -3469,6 +3477,10 @@ copy_generic_opt_elem:
> {
> $$ = makeDefElem($1, $2, @1);
> }
> + | FORMAT_LA copy_generic_opt_arg
> + {
> + $$ = makeDefElem("format", $2, @1);
> + }
> ;
>
> I think it's not necessary. "format" option is already handled in
> copy_generic_opt_elem.
>

test it, I found out this part is necessary.
because a query with WITH like `copy (select 1) to stdout with
(format json, force_array false); ` will fail.

> ---
> +/* need delimiter to start next json array element */
> +static bool json_row_delim_needed = false;
>
> I think it's cleaner to include json_row_delim_needed into CopyToStateData.
yes. I agree. So I did it.

> ---
> Splitting the patch into two patches: add json format and add
> force_array option would make reviews easy.
>
done. one patch for json format, another one for force_array option.

I also made the following cases fail.
copy copytest to stdout (format csv, force_array false);
ERROR: specify COPY FORCE_ARRAY is only allowed in JSON mode.

If copy to table then call table_scan_getnextslot no need to worry
about the Tupdesc.
however if we copy a query output as format json, we may need to consider it.

cstate->queryDesc->tupDesc is the output of Tupdesc, we can rely on this.
for copy a query result to json, I memcpy( cstate->queryDesc->tupDesc)
to the the slot's slot->tts_tupleDescriptor
so composite_to_json can use cstate->queryDesc->tupDesc to do the work.
I guess this will make it more bullet-proof.

Attachment Content-Type Size
v8-0002-Add-force_array-for-COPY-TO-json-fomrat.patch text/x-patch 8.5 KB
v8-0001-Add-another-COPY-fomrat-json.patch text/x-patch 13.7 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Olivier Gautherot 2024-01-23 06:35:56 Re: Disk Groups/Storage Management for a Large Database in PostgreSQL
Previous Message Adrian Klaver 2024-01-23 05:25:58 Re: Backup certain months old data

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2024-01-23 05:35:48 Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Previous Message David Rowley 2024-01-23 05:10:52 Re: Removing const-false IS NULL quals and redundant IS NOT NULL quals