Re: Make COPY format extendable: Extract COPY TO format implementations

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2023-12-07 00:38:59
Message-ID: ZXEUIy6wl4jHy6Nm@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 06, 2023 at 10:07:51PM +0800, Junwang Zhao wrote:
> I read the thread[1] you posted and I think Andres's suggestion sounds great.
>
> Should we extract both *copy to* and *copy from* for the first step, in that
> case we can add the pg_copy_handler catalog smoothly later.
>
> Attached V4 adds 'extract copy from' and it passed the cirrus ci,
> please take a look.
>
> I added a hook *copy_from_end* but this might be removed later if not used.
>
> [1]: https://www.postgresql.org/message-id/20180211211235.5x3jywe5z3lkgcsr%40alap3.anarazel.de

I was looking at the differences between v3 posted by Sutou-san and
v4 from you, seeing that:

+/* Routines for a COPY HANDLER implementation. */
+typedef struct CopyHandlerOps
{
/* Called when COPY TO is started. This will send a header. */
- void (*start) (CopyToState cstate, TupleDesc tupDesc);
+ void (*copy_to_start) (CopyToState cstate, TupleDesc tupDesc);

/* Copy one row for COPY TO. */
- void (*one_row) (CopyToState cstate, TupleTableSlot *slot);
+ void (*copy_to_one_row) (CopyToState cstate, TupleTableSlot *slot);

/* Called when COPY TO is ended. This will send a trailer. */
- void (*end) (CopyToState cstate);
-} CopyToFormatOps;
+ void (*copy_to_end) (CopyToState cstate);
+
+ void (*copy_from_start) (CopyFromState cstate, TupleDesc tupDesc);
+ bool (*copy_from_next) (CopyFromState cstate, ExprContext *econtext,
+ Datum *values, bool *nulls);
+ void (*copy_from_error_callback) (CopyFromState cstate);
+ void (*copy_from_end) (CopyFromState cstate);
+} CopyHandlerOps;

And we've spent a good deal of time refactoring the copy code so as
the logic behind TO and FROM is split. Having a set of routines that
groups both does not look like a step in the right direction to me,
and v4 is an attempt at solving two problems, while v3 aims to improve
one case. It seems to me that each callback portion should be focused
on staying in its own area of the code, aka copyfrom*.c or copyto*.c.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2023-12-07 00:39:11 Re: Emitting JSON to file using COPY TO
Previous Message Noah Misch 2023-12-07 00:33:33 Re: reindexing an invalid index should not use ERRCODE_INDEX_CORRUPTED