Re: Emitting JSON to file using COPY TO

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Florents Tselai <florents(dot)tselai(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2026-03-08 16:16:08
Message-ID: CACJufxFFZqxC3p4WjpTEi4riaJm=pADX+py0yQ0=RWTn5cqK3Q@mail.gmail.com
Lists: pgsql-general pgsql-hackers

hi.

V27-0002 is still not bullet-proof.

drop table if exists t1;
create table t1(a int);
insert into t1 values (1);
copy (select * from t1) to stdout json;
{"a":1}
WARNING: resource was not closed: TupleDesc 0x7171d0ca3440 (18239,-1)

See also ExecAssignScanProjectionInfo -> ExecConditionalAssignProjectionInfo.
So in v28-0002, I changed it to:
+ /*
+ * composite_to_json() requires a stable TupleDesc. Since the slot's
+ * descriptor (slot->tts_tupleDescriptor) can change during the execution
+ * of a SELECT query, we use cstate->queryDesc->tupDesc instead. This
+ * precaution is only necessary when the output slot's TupleDesc is of
+ * type RECORDOID.
+ */
+ if (!cstate->rel && slot->tts_tupleDescriptor->tdtypeid == RECORDOID)
+ slot->tts_tupleDescriptor = cstate->queryDesc->tupDesc;

+ cstate->json_projvalues = (Datum *) palloc(natts * sizeof(Datum));
+ cstate->json_projnulls = (bool *) palloc(natts * sizeof(bool));
I changed it to
+ cstate->json_projvalues = palloc_array(Datum, natts);
+ cstate->json_projnulls = palloc_array(bool, natts);

+ rowdata = HeapTupleHeaderGetDatum(tup->t_data);
I changed it to
+ rowdata = HeapTupleGetDatum(tup);

Patch v28-0004 adds the json_projvalues and json_projnulls pointers to struct
CopyToStateData. I wondered whether adding these would slow down COPY TO with
the TEXT and CSV formats, so I ran a quick test using a 36-column table.

Surprisingly, v28 actually makes COPY TO with TEXT and CSV perform a little
faster, though I haven't found out why.
You may also try the attached test script: copyto_json_perfomance_test.nocfbot.
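For anyone without the attached script, a rough sketch of such a timing test might look like the following. The table, column, and file names here are illustrative, not taken from copyto_json_perfomance_test.nocfbot, and the JSON COPY assumes a server built with these patches:

```shell
#!/bin/sh
# Generate a benchmark SQL file for a 36-column table, then run it with:
#   psql -f copy_bench.sql
collist=""
for i in $(seq 1 36); do
    collist="$collist${collist:+, }g AS c$i"
done
cat > copy_bench.sql <<EOF
\timing on
CREATE TABLE copy_bench AS
SELECT $collist FROM generate_series(1, 1000000) g;
COPY copy_bench TO '/tmp/bench.txt' (FORMAT text);
COPY copy_bench TO '/tmp/bench.csv' (FORMAT csv);
COPY copy_bench TO '/tmp/bench.json' (FORMAT json);
EOF
echo "wrote copy_bench.sql"
```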

--
jian
https://www.enterprisedb.com/

Attachment Content-Type Size
v28-0002-json-format-for-COPY-TO.patch text/x-patch 23.7 KB
v28-0001-introduce-CopyFormat-refactor-CopyFormatOptions.patch text/x-patch 13.1 KB
v28-0003-Add-option-force_array-for-COPY-JSON-FORMAT.patch text/x-patch 12.3 KB
v28-0004-COPY-TO-JSON-support-column-lists.patch text/x-patch 12.7 KB
copyto_json_perfomance_test.nocfbot application/octet-stream 1.7 KB
