Re: Emitting JSON to file using COPY TO

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Florents Tselai <florents(dot)tselai(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2026-03-08 19:44:51
Message-ID: f137f5d7-bae7-4145-8c76-e83d2de13d59@dunslane.net
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers


On 2026-03-08 Su 12:16 PM, jian he wrote:
> hi.
>
> V27-0002 is still not bullet-proof.
>
> drop table if exists t1;
> create table t1(a int);
> insert into t1 values (1);
> copy (select * from t1) to stdout json;
> {"a":1}
> WARNING: resource was not closed: TupleDesc 0x7171d0ca3440 (18239,-1)
>
> Also see ExecAssignScanProjectionInfo->ExecConditionalAssignProjectionInfo
> So in v28-0002, I changed to
> + /*
> + * composite_to_json() requires a stable TupleDesc. Since the slot's
> + * descriptor (slot->tts_tupleDescriptor) can change during the execution
> + * of a SELECT query, we use cstate->queryDesc->tupDesc instead. This
> + * precaution is only necessary when the output slot's TupleDesc is of
> + * type RECORDOID.
> + */
> + if (!cstate->rel && slot->tts_tupleDescriptor->tdtypeid == RECORDOID)
> + slot->tts_tupleDescriptor = cstate->queryDesc->tupDesc;

Hmm. But should we be scribbling on slot->tts_tupleDescriptor like that?
How about something like this?:

-        * Full table or query without column list.  Ensure the slot uses
-        * cstate->tupDesc so that the datum is stamped with the right type;
-        * for queries output type is RECORDOID this must be the blessed
-        * descriptor so that composite_to_json can look it up via
-        * lookup_rowtype_tupdesc.
+        * Full table or query without column list.  For queries, the slot's
+        * TupleDesc may carry RECORDOID, which is not registered in the
type
+        * cache and would cause composite_to_json's lookup_rowtype_tupdesc
+        * call to fail.  Build a HeapTuple stamped with the blessed
+        * descriptor so the type can be looked up correctly.
         */
        if (!cstate->rel && slot->tts_tupleDescriptor->tdtypeid ==
RECORDOID)
-           slot->tts_tupleDescriptor = cstate->queryDesc->tupDesc;
+       {
+           HeapTuple   tup;

-       rowdata = ExecFetchSlotHeapTupleDatum(slot);
+           tup = heap_form_tuple(cstate->tupDesc,
+                                 slot->tts_values,
+                                 slot->tts_isnull);
+           rowdata = HeapTupleGetDatum(tup);
+       }
+       else
+       {
+           rowdata = ExecFetchSlotHeapTupleDatum(slot);
+       }

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Greg Sabino Mullane 2026-03-08 22:08:01 Re: Unexpected deadlock across two separate rows, using Postgres 17 and Django's select_for_update()
Previous Message felix.quintgz 2026-03-08 18:23:27 Re: Unexpected deadlock across two separate rows, using Postgres 17 and Django's select_for_update()

Browse pgsql-hackers by date

  From Date Subject
Next Message Manni Wood 2026-03-08 19:45:44 Re: Speed up COPY FROM text/CSV parsing using SIMD
Previous Message Tomas Vondra 2026-03-08 19:31:22 Re: tid_blockno() and tid_offset() accessor functions