Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Sutou Kouhei <kou(at)clear-code(dot)com>
Cc: michael(at)paquier(dot)xyz, david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, zhjwpku(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2025-06-30 06:00:45
Message-ID: CAD21AoCyTcYzFocKhtpWMBp0V27d3sLpwnrbu=HaDSOHMxSFXg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 25, 2025 at 4:35 PM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
>
> Hi,
>
> In <CAD21AoC19fV5Ujs-1r24MNU+hwTQUeZMEnaJDjSFwHLMMdFi0Q(at)mail(dot)gmail(dot)com>
> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Wed, 25 Jun 2025 00:48:46 +0900,
> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> >> >> It's natural to add more related APIs with this
> >> >> approach. The single registration API provides one feature
> >> >> by one operation. If we use the RegisterCopyRoutine() for
> >> >> FROM and TO formats API, it's not natural that we add more
> >> >> related APIs. In this case, some APIs may provide multiple
> >> >> features by one operation and other APIs may provide single
> >> >> feature by one operation. Developers may be confused with
> >> >> the API. For example, developers may think "what does mean
> >> >> NULL here?" or "can we use NULL here?" for
> >> >> "RegisterCopyRoutine("new-format", NewFormatFromRoutine,
> >> >> NULL)".
> >> >
> >> > We can document it in the comment for the registration function.
> >>
> >> I think that API that can be understandable without the
> >> additional note is better API than API that needs some
> >> notes.
> >
> > I don't see much difference in this case.
>
> OK. It seems that we can't agree on which API is better.
>
> I've implemented your idea as the v42 patch set. Can we
> proceed this proposal with this approach? What is the next
> step?

I'll review the patches. In the meantime, could you update the
documentation accordingly?

>
> > No. I think that if extensions are likely to support both
> > CopyToRoutine and CopyFromRoutine in most cases, it would be simpler
> > to register the custom format using a single API. Registering
> > CopyToRoutine and CopyFromRoutine separately seems redundant to me.
>
> I don't think so. In general, extensions are implemented
> step by step. Extension developers will not implement
> CopyToRoutine and CopyFromRoutine at once even if extensions
> implement both of CopyToRoutine and CopyFromRoutine
> eventually.

Hmm, I think if the extension eventually implements both directions,
it would make sense to provide a single API.

>
> > Could you provide some examples? It seems to me that even if we
> > provide the single API for the registration we can provide other APIs
> > differently. For example, if we want to provide an API to register a
> > custom option, we can provide RegisterCopyToOption() and
> > RegisterCopyFromOption().
>
> Yes. We can mix different style APIs. In general, consistent
> style APIs is easier to use than mixed style APIs. If it's
> not an important point in PostgreSQL API design, my point is
> meaningless. (Sorry, I'm not familiar with PostgreSQL API
> design.)

As far as I know, there is no standard for PostgreSQL API design, but
I don't find any weirdness in this design.

>
> > My point is about the consistency of registration behavior. I think
> > that we should raise an error if the custom format name that an
> > extension tries to register already exists. Therefore I'm not sure why
> > installing extension-A+B is okay but installing extension-C+A or
> > extension-C+B is not okay? We can think that's an extension-A's choice
> > not to implement CopyFromRoutine for the 'myformat' format so
> > extension-B should not change it.
>
> I think that it's the users' responsibility. I think that
> it's more convenient that users can mix extension-A+B (A
> provides only TO format and B provides only FROM format)
> than users can't mix them. I think that extension-A doesn't
> want to prohibit FROM format in the case. Extension-A just
> doesn't care about FROM format.
>
> FYI: Both of extension-C+A and extension-C+B are OK when we
> update not raising an error existing format.

I want to keep the basic design that one custom format comes from one
extension, because it's straightforward for both us and users, and it
keeps format IDs easy to maintain. IIUC we somewhat agreed on this
design in the previous API design (the TABLESAMPLE-like API).
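
As a rough illustration of that design (one name, one registration,
duplicate names rejected), here is a standalone C toy. It is not the
v42 patch code: every Toy*/toy_* name is hypothetical, the routine
structs are empty placeholders for the real callback tables, and where
the toy returns an error value the server would presumably raise an
error via ereport(ERROR) instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical placeholders; the real routine structs hold callbacks. */
typedef struct ToyCopyToRoutine { int unused; } ToyCopyToRoutine;
typedef struct ToyCopyFromRoutine { int unused; } ToyCopyFromRoutine;

/* One entry per custom format; its array index doubles as the format ID. */
typedef struct ToyFormatEntry
{
    const char *name;
    const ToyCopyToRoutine *to;     /* NULL: COPY TO is not supported */
    const ToyCopyFromRoutine *from; /* NULL: COPY FROM is not supported */
} ToyFormatEntry;

#define TOY_MAX_FORMATS 8
#define TOY_INVALID_FORMAT_ID (-1)

static ToyFormatEntry toy_formats[TOY_MAX_FORMATS];
static int toy_nformats = 0;

/* Find a registered format by name; returns NULL if it is unknown. */
const ToyFormatEntry *
toy_lookup_copy_format(const char *name)
{
    for (int i = 0; i < toy_nformats; i++)
    {
        if (strcmp(toy_formats[i].name, name) == 0)
            return &toy_formats[i];
    }
    return NULL;
}

/*
 * Register both directions of a format in one call. Either routine may be
 * NULL, but not both. Returns the new format ID, or TOY_INVALID_FORMAT_ID
 * if the name is already taken or the table is full. The name pointer is
 * stored as-is, so it must outlive the registry.
 */
int
toy_register_copy_format(const char *name,
                         const ToyCopyToRoutine *to,
                         const ToyCopyFromRoutine *from)
{
    assert(name != NULL);
    assert(to != NULL || from != NULL);

    /* A second extension cannot extend or override an existing format. */
    if (toy_lookup_copy_format(name) != NULL)
        return TOY_INVALID_FORMAT_ID;
    if (toy_nformats >= TOY_MAX_FORMATS)
        return TOY_INVALID_FORMAT_ID;

    toy_formats[toy_nformats] = (ToyFormatEntry){name, to, from};
    return toy_nformats++;
}
```

In this toy, if extension-A registers "myformat" with only a TO routine,
a later attempt by extension-B to register "myformat" with a FROM
routine fails, and the format keeps A's NULL FROM entry. Because each
name maps to exactly one entry, a format ID never has to be shared
between extensions.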

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
