Re: Make COPY format extendable: Extract COPY TO format implementations

From: Sutou Kouhei <kou(at)clear-code(dot)com>
To: michael(at)paquier(dot)xyz
Cc: sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-02-02 08:46:18
Message-ID: 20240202.174618.349091157390883477.kou@clear-code.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

In <ZbyiDHIrxRgzYT99(at)paquier(dot)xyz>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 2 Feb 2024 17:04:28 +0900,
Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> One idea I was considering is whether we should use a special value in
> the "format" DefElem, say "custom:$my_custom_format" where it would be
> possible to bypass the formay check when processing options and find
> the routines after processing all the options. I'm not wedded to
> that, but attaching the routines to the state data is IMO the correct
> thing, because this has nothing to do with CopyFormatOptions.

Thanks for sharing your idea.
Let's discuss how to support custom options after we
complete the current performance changes.

>> I'm OK with the approach. But how about adding the extra
>> callbacks to Copy{From,To}StateData not
>> Copy{From,To}Routines like CopyToStateData::data_dest_cb and
>> CopyFromStateData::data_source_cb? They are only needed for
>> "text" and "csv". So we don't need to add them to
>> Copy{From,To}Routines to keep required callback minimum.
>
> And set them in cstate while we are in the Start routine, right?

I imagined that it's done around the following part:

@@ -1418,6 +1579,9 @@ BeginCopyFrom(ParseState *pstate,
/* Extract options from the statement node tree */
ProcessCopyOptions(pstate, &cstate->opts, true /* is_from */ , options);

+ /* Set format routine */
+ cstate->routine = CopyFromGetRoutine(cstate->opts);
+
/* Process the target relation */
cstate->rel = rel;

Example1:

/* Set format routine */
cstate->routine = CopyFromGetRoutine(cstate->opts);
if (!cstate->opts.binary)
if (cstate->opts.csv_mode)
cstate->copy_read_attributes = CopyReadAttributesCSV;
else
cstate->copy_read_attributes = CopyReadAttributesText;

Example2:

static void
CopyFromSetRoutine(CopyFromState cstate)
{
if (cstate->opts.csv_mode)
{
cstate->routine = &CopyFromRoutineCSV;
cstate->copy_read_attributes = CopyReadAttributesCSV;
}
else if (cstate.binary)
cstate->routine = &CopyFromRoutineBinary;
else
{
cstate->routine = &CopyFromRoutineText;
cstate->copy_read_attributes = CopyReadAttributesText;
}
}

BeginCopyFrom()
{
/* Set format routine */
CopyFromSetRoutine(cstate);
}

But I don't object your original approach. If we have the
extra callbacks in Copy{From,To}Routines, I just don't use
them for my custom format extension.

>> What is the better next action for us? Do you want to
>> complete the WIP v11 patch set by yourself (and commit it)?
>> Or should I take over it?
>
> I was planning to work on that, but wanted to be sure how you felt
> about the problem with text and csv first.

OK.
My opinion is the above. I have an idea how to implement it
but it's not a strong idea. You can choose whichever you like.

Thanks,
--
kou

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2024-02-02 09:00:00 Re: ResourceOwner refactoring
Previous Message Yugo NAGATA 2024-02-02 08:41:56 Re: Checking MINIMUM_VERSION_FOR_WAL_SUMMARIES