From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Sutou Kouhei <kou(at)clear-code(dot)com> |
Cc: | michael(at)paquier(dot)xyz, david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, zhjwpku(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date: | 2025-06-24 02:59:17 |
Message-ID: | CAD21AoA57owo6qYTPTxOtCjDmcuj1tGL1aGs95cvEnoLQvwF0A@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Jun 18, 2025 at 12:59 PM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
>
> Hi,
>
> In <aFC5HmZHU5NCPuTL(at)paquier(dot)xyz>
> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Tue, 17 Jun 2025 09:38:54 +0900,
> Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> > On Tue, Jun 17, 2025 at 08:50:37AM +0900, Sutou Kouhei wrote:
> >> OK. I'll implement the initial version with this
> >> design. (Allocating IDs local not shared.)
> >
> > Sounds good to me. Thanks Sutou-san!
>
> I've attached the v41 patch set that uses the C API approach
> with local (not shared) COPY routine management.
>
> 0001: This is same as 0001 in the v40 patch set. It just
> cleans up CopySource and CopyDest enums.
> 0002: This is the initial version of this approach.
Thank you for updating the patches!
> Here are some discussion points:
>
> 1. This provides 2 registration APIs
> (RegisterCopy{From,To}Routine(name, routine)) instead of
> 1 registration API (RegisterCopyFormat(name,
> from_routine, to_routine)).
>
> It's for simple implementation and easy to extend without
> breaking APIs in the future. (And some formats may
> provide only FROM routine or TO routine.)
>
> Is this design acceptable?
With the single registration API idea, we can register the custom
format routine that supports only FROM routine using the API like:
RegisterCopyRoutine("new-format", NewFormatFromRoutine, NULL);
Compared to this approach, what points do you think having separate
two registration APIs is preferable in terms of extendability and API
compatibility? I think it would be rather confusing for example if
each COPY TO routine and COPY FROM routine is registered by different
extensions with the same format name.
> FYI: RegisterCopy{From,To}Routine() uses the same logic
> as RegisterExtensionExplainOption().
I'm concerned that the patch has duplicated logics for the
registration of COPY FROM and COPY TO.
>
> 2. This allocates IDs internally but doesn't provide APIs
> that get them. Because it's not needed for now.
>
> We can provide GetExplainExtensionId() like API when we
> need it.
>
> Is this design acceptable?
+1
>
> 3. I want to register the built-in COPY {FROM,TO} routines
> in the PostgreSQL initialization phase. Where should we
> do it? In 0002, it's done in InitPostgres() but I'm not
> sure whether it's a suitable location or not.
InitPostgres() is not a correct function as it's a process
initialization function. Probably we don't necessarily need to
register the built-in formats in the same way as custom formats. A
simple solution would be to have separate arrays for built-in formats
and custom formats and have the GetCopy[To|From]Routine() search both
arrays (built-in array first).
> 4. 0002 adds CopyFormatOptions::routine as union:
>
> @@ -87,9 +91,14 @@ typedef struct CopyFormatOptions
> CopyLogVerbosityChoice log_verbosity; /* verbosity of logged messages */
> int64 reject_limit; /* maximum tolerable number of errors */
> List *convert_select; /* list of column names (can be NIL) */
> + union
> + {
> + const struct CopyFromRoutine *from; /* for COPY FROM */
> + const struct CopyToRoutine *to; /* for COPY TO */
> + } routine; /* routine to process the specified format */
> } CopyFormatOptions;
>
> Because one of Copy{From,To}Routine is only needed at
> once. Is this union usage strange in PostgreSQL?
I think we can live with having two fields as there are other options
that are used only in either COPY FROM or COPY TO.
>
> 5. 0002 adds InitializeCopy{From,To}Routines() and
> GetCopy{From,To}Routine() that aren't used by COPY
> {FROM,TO} routine implementations to copyapi.h. Should we
> move them to other .h? If so, which .h should be used for
> them?
As I commented at 3, I think it's better to avoid dynamically
registering the built-in formats.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Yugo Nagata | 2025-06-24 03:19:48 | Re: Per-role disabling of LEAKPROOF requirements for row-level security? |
Previous Message | Tomas Vondra | 2025-06-24 01:43:19 | Re: pgsql: Introduce pg_shmem_allocations_numa view |