Re: Make COPY format extendable: Extract COPY TO format implementations

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Sutou Kouhei <kou(at)clear-code(dot)com>, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2023-12-08 05:17:42
Message-ID: ZXKm9tmnSPIVrqZz@paquier.xyz
Lists: pgsql-hackers

On Fri, Dec 08, 2023 at 10:32:27AM +0800, Junwang Zhao wrote:
> I can see FDW related utility commands but no TABLESAMPLE related,
> and there is a pg_foreign_data_wrapper system catalog which has
> a *fdwhandler* field.

+ */
+CATALOG(pg_copy_handler,4551,CopyHandlerRelationId)

Using a catalog is an over-engineered design. Others have provided
hints about that upthread, but it would be enough to have one or two
handler types that are wrapped around one or two SQL *functions*, like
tablesamples. It seems like you've missed it, but feel free to read
tablesample-method.sgml, which explains how this is achieved for
tablesamples.

> If we want extensions to create a new copy handler, I think
> something like pg_copy_hander should be necessary.

A catalog is not necessary, and that's the point: it can be replaced
by a scan of pg_proc for the function name given in the COPY query
(be it through FORMAT, or a different option in a DefElem).
An example of an extension with tablesamples is
contrib/tsm_system_rows/, which just uses a function returning a
tsm_handler:
CREATE FUNCTION system_rows(internal)
RETURNS tsm_handler
AS 'MODULE_PATHNAME', 'tsm_system_rows_handler'
LANGUAGE C STRICT;

Then SELECT queries rely on the contents of the TABLESAMPLE clause to
find the set of callbacks they should use by calling the function.
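
To illustrate the shape of that pattern, here is a minimal standalone
sketch in plain C. It is not PostgreSQL code: every Demo*/demo_* name
is made up for the illustration, and the small array only stands in
for a pg_proc lookup by function name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table, playing the role of a struct like TsmRoutine. */
typedef struct DemoRoutine
{
    const char *label;
    int         (*next_row) (int current_row);
} DemoRoutine;

/* A handler takes no useful input and only returns the callback table. */
typedef const DemoRoutine *(*DemoHandler) (void);

static int
demo_skip_one(int current_row)
{
    return current_row + 2;
}

static const DemoRoutine *
demo_skip_one_handler(void)
{
    static const DemoRoutine routine = {"skip_one", demo_skip_one};

    return &routine;
}

/* Stand-in for pg_proc: maps a function name to the handler to call. */
typedef struct DemoProc
{
    const char *proname;
    DemoHandler handler;
} DemoProc;

static const DemoProc demo_procs[] = {
    {"skip_one", demo_skip_one_handler},
};

/* Resolve the name used in the query, then call the handler once. */
static const DemoRoutine *
demo_lookup_routine(const char *proname)
{
    size_t      i;

    assert(proname != NULL);
    for (i = 0; i < sizeof(demo_procs) / sizeof(demo_procs[0]); i++)
    {
        if (strcmp(demo_procs[i].proname, proname) == 0)
            return demo_procs[i].handler();
    }
    return NULL;                /* no such function: caller reports an error */
}
```

The caller never needs a dedicated catalog here: the name in the query
is enough to find the function, and the function is enough to obtain
the callbacks.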

+/* Routines for a COPY HANDLER implementation. */
+typedef struct CopyRoutine
+{

FWIW, I find the concept of having one handler for both COPY FROM and
COPY TO weird, as each of them has callbacks that are mutually
exclusive with the other's, but I'm OK with it if the consensus is to
have only one. So I'd suggest using *two* NodeTags instead for a
cleaner split, meaning that we'd need two functions for each method.
My point is that a custom COPY handler could define just a COPY TO
handler or just a COPY FROM handler, though it mostly comes down to a
matter of taste regarding how clean the error handling becomes if one
tries to use a set of callbacks with a COPY type (TO or FROM) not
matching it.
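
As a rough standalone sketch of what a two-tag split could look like
(plain C, not taken from the patch; the Demo* types, tag values and
error strings are all hypothetical), the mismatch check reduces to
looking at the tag stored in the first member, in the spirit of IsA():

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tags; PostgreSQL's real NodeTag values are generated. */
typedef enum DemoNodeTag
{
    T_DemoCopyToRoutine = 1,
    T_DemoCopyFromRoutine
} DemoNodeTag;

/* Each direction only carries the callbacks that make sense for it. */
typedef struct DemoCopyToRoutine
{
    DemoNodeTag type;           /* must be first */
    int         (*send_row) (int row);
} DemoCopyToRoutine;

typedef struct DemoCopyFromRoutine
{
    DemoNodeTag type;           /* must be first */
    int         (*read_row) (int row);
} DemoCopyFromRoutine;

/* Same idea as IsA(): inspect the tag stored in the first member. */
static bool
demo_is_a(const void *routine, DemoNodeTag tag)
{
    return routine != NULL && *(const DemoNodeTag *) routine == tag;
}

/* Returns NULL when the handler matches the COPY direction, else a message. */
static const char *
demo_check_direction(const void *routine, bool is_from)
{
    if (is_from && !demo_is_a(routine, T_DemoCopyFromRoutine))
        return "handler does not support COPY FROM";
    if (!is_from && !demo_is_a(routine, T_DemoCopyToRoutine))
        return "handler does not support COPY TO";
    return NULL;
}
```

With one struct per direction, a format that only implements COPY TO
simply never provides a COPY FROM handler function, and the error path
is a single tag comparison.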
--
Michael
