Re: Make COPY format extendable: Extract COPY TO format implementations

From: Sutou Kouhei <kou(at)clear-code(dot)com>
To: michael(at)paquier(dot)xyz
Cc: sawada(dot)mshk(at)gmail(dot)com, david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, zhjwpku(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2025-06-18 03:59:20
Message-ID: 20250618.125920.377159334539867546.kou@clear-code.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

In <aFC5HmZHU5NCPuTL(at)paquier(dot)xyz>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Tue, 17 Jun 2025 09:38:54 +0900,
Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Tue, Jun 17, 2025 at 08:50:37AM +0900, Sutou Kouhei wrote:
>> OK. I'll implement the initial version with this
>> design. (Allocating IDs local not shared.)
>
> Sounds good to me. Thanks Sutou-san!

I've attached the v41 patch set that uses the C API approach
with local (not shared) COPY routine management.

0001: This is same as 0001 in the v40 patch set. It just
cleans up CopySource and CopyDest enums.
0002: This is the initial version of this approach.

Here are some discussion points:

1. This provides 2 registration APIs
(RegisterCopy{From,To}Routine(name, routine)) instead of
1 registration API (RegisterCopyFormat(name,
from_routine, to_routine)).

It's for simple implementation and easy to extend without
breaking APIs in the future. (And some formats may
provide only FROM routine or TO routine.)

Is this design acceptable?

FYI: RegisterCopy{From,To}Routine() uses the same logic
as RegisterExtensionExplainOption().

2. This allocates IDs internally but doesn't provide APIs
that get them. Because it's not needed for now.

We can provide GetExplainExtensionId() like API when we
need it.

Is this design acceptable?

3. I want to register the built-in COPY {FROM,TO} routines
in the PostgreSQL initialization phase. Where should we
do it? In 0002, it's done in InitPostgres() but I'm not
sure whether it's a suitable location or not.

4. 0002 adds CopyFormatOptions::routine as union:

@@ -87,9 +91,14 @@ typedef struct CopyFormatOptions
CopyLogVerbosityChoice log_verbosity; /* verbosity of logged messages */
int64 reject_limit; /* maximum tolerable number of errors */
List *convert_select; /* list of column names (can be NIL) */
+ union
+ {
+ const struct CopyFromRoutine *from; /* for COPY FROM */
+ const struct CopyToRoutine *to; /* for COPY TO */
+ } routine; /* routine to process the specified format */
} CopyFormatOptions;

Because one of Copy{From,To}Routine is only needed at
once. Is this union usage strange in PostgreSQL?

5. 0002 adds InitializeCopy{From,To}Routines() and
GetCopy{From,To}Routine() that aren't used by COPY
{FROM,TO} routine implementations to copyapi.h. Should we
move them to other .h? If so, which .h should be used for
them?

Thanks,
--
kou

Attachment Content-Type Size
v41-0001-Export-CopyDest-as-private-data.patch text/x-patch 8.6 KB
v41-0002-Add-support-for-registering-COPY-FROM-TO-routine.patch text/x-patch 27.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2025-06-18 04:07:34 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Peter Eisentraut 2025-06-18 03:58:16 Re: Huge commitfest app update upcoming: Tags, Draft CF, Help page, and automated commitfest creat/open/close