Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, michael(at)paquier(dot)xyz, david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, zhjwpku(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2025-07-17 20:33:13
Message-ID: CAD21AoAQkjU=o0nX4y0jtX0BnsrqA04g2ABqrUwjT88YeEWarA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 15, 2025 at 5:37 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2025-07-14 03:28:16 +0900, Masahiko Sawada wrote:
> > I've reviewed the 0001 and 0002 patches. The API implemented in the
> > 0002 patch looks good to me, but I'm concerned about the encapsulation
> > of copy state data. With the v42 patches, we pass the whole
> > CopyToStateData to the extension codes, but most of the fields in
> > CopyToStateData are internal working state data that shouldn't be
> > exposed to extensions. I think we need to sort out which fields are
> > exposed or not. That way, it would be safer and we would be able to
> > avoid exposing copyto_internal.h and extensions would not need to
> > include copyfrom_internal.h.
> >
> > I've implemented a draft patch for that idea. In the 0001 patch, I
> > moved fields that are related to internal working state from
> > CopyToStateData to CopyToExectuionData. COPY routine APIs pass a
> > pointer of CopyToStateData but extensions can access only fields
> > except for CopyToExectuionData. In the 0002 patch, I've implemented
> > the registration API and some related APIs based on your v42 patch.
> > I've made similar changes to COPY FROM codes too.
>
> I've not followed the development of this patch - but I continue to be
> concerned about the performance impact it has as-is and the amount of COPY
> performance improvements it forecloses.
>
> This seems to add yet another layer of indirection to a lot of hot functions
> like CopyGetData() etc.
>

Most of the refactoring work has already been done by commits
7717f6300 and 2e4127b6d, with a slight performance gain. At this
stage, we're trying to introduce the registration API so that
extensions can provide their callbacks to the core. Some functions
required for I/O, such as CopyGetData() and CopySendEndOfRow(), would
be exposed, but I'm not going to add any additional layers of
function-call indirection.
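
As a rough illustration of that shape, here is a minimal standalone
sketch of a callback registry. It is not the API in the patch set:
every name prefixed with demo_ or Demo is hypothetical, and the state
is a plain void pointer rather than CopyToStateData. The point it
shows is that the format is looked up once per COPY, after which each
row costs one indirect call through the callback table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table; not the real CopyToRoutine. */
typedef struct DemoCopyToRoutine
{
    void (*start) (void *state);
    void (*one_row) (void *state, const char *row);
    void (*end) (void *state);
} DemoCopyToRoutine;

typedef struct DemoFormatEntry
{
    const char *name;
    const DemoCopyToRoutine *routine;
} DemoFormatEntry;

#define DEMO_MAX_FORMATS 8

static DemoFormatEntry demo_formats[DEMO_MAX_FORMATS];
static int demo_nformats = 0;

/* Look up a registered format by name; returns NULL if unknown. */
static const DemoCopyToRoutine *
demo_lookup_copy_format(const char *name)
{
    for (int i = 0; i < demo_nformats; i++)
        if (strcmp(demo_formats[i].name, name) == 0)
            return demo_formats[i].routine;
    return NULL;
}

/* Register a format; returns 0 on success, -1 on bad input, duplicate, or full table. */
static int
demo_register_copy_format(const char *name, const DemoCopyToRoutine *routine)
{
    if (name == NULL || routine == NULL || routine->one_row == NULL)
        return -1;
    if (demo_nformats >= DEMO_MAX_FORMATS || demo_lookup_copy_format(name) != NULL)
        return -1;
    demo_formats[demo_nformats].name = name;
    demo_formats[demo_nformats].routine = routine;
    demo_nformats++;
    return 0;
}

/* Resolve the format once, then make one indirect call per row. */
static int
demo_copy_to(const char *format, void *state, const char *const *rows, int nrows)
{
    const DemoCopyToRoutine *routine = demo_lookup_copy_format(format);

    if (routine == NULL)
        return -1;
    if (routine->start)
        routine->start(state);
    for (int i = 0; i < nrows; i++)
        routine->one_row(state, rows[i]);
    if (routine->end)
        routine->end(state);
    return nrows;
}

/* Example "extension": a format that only counts rows into an int. */
static void demo_count_start(void *state) { *(int *) state = 0; }
static void demo_count_row(void *state, const char *row) { (void) row; (*(int *) state)++; }

static const DemoCopyToRoutine demo_count_routine = {
    demo_count_start, demo_count_row, NULL
};
```

An extension in this sketch would call
demo_register_copy_format("count", &demo_count_routine) once at load
time; demo_copy_to() then drives the callbacks without any wrapper
layer between the core loop and the extension's one_row function.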

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
