Re: Make COPY format extendable: Extract COPY TO format implementations

From: Sutou Kouhei <kou(at)clear-code(dot)com>
To: michael(at)paquier(dot)xyz
Cc: andres(at)anarazel(dot)de, sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-03-04 05:11:08
Message-ID: 20240304.141108.377465274442209834.kou@clear-code.com
Lists: pgsql-hackers

Hi,

In <20240301(dot)154443(dot)618034282613922707(dot)kou(at)clear-code(dot)com>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Fri, 01 Mar 2024 15:44:43 +0900 (JST),
Sutou Kouhei <kou(at)clear-code(dot)com> wrote:

>> I guess so. It does not make much of a difference, though. The thing
>> is that the dispatch caused by the custom callbacks called for each
>> row is noticeable in any profiles I'm taking (not that much in the
>> worst-case scenarios, still a few percents), meaning that this impacts
>> the performance for all the in-core formats (text, csv, binary) as
>> long as we refactor text/csv/binary to use the routines of copyapi.h.
>> I don't really see a way forward, except if we don't dispatch the
>> in-core formats to not impact the default cases. That makes the code
>> a bit less elegant, but equally efficient for the existing formats.
>
> It's an option based on your profile result but your
> execution result also shows that v15 is faster than HEAD [1]:
>
>> I am getting faster runtimes with v15 (6232ms in average)
>> vs HEAD (6550ms) at 5M rows with COPY TO
>
> [1] https://www.postgresql.org/message-id/flat/ZdbtQJ-p5H1_EDwE%40paquier.xyz#6439e6ad574f2d47cd7220e9bfed3889
>
> I think that a faster runtime is more beneficial to users
> than a mysterious profile. So I think that we can merge v15
> to master.

If this is a blocker for making the COPY format extendable,
can we defer moving the existing text/csv/binary format
implementations to Copy{From,To}Routine for now, as Michael
suggested, so that we can proceed with making the COPY format
extendable? (That is, can we add Copy{From,To}Routine without
changing the existing text/csv/binary format implementations?)
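
To illustrate the idea only, here is a minimal standalone sketch. It is not taken from the attached patch: every name in it (DemoCopyToRoutine, DemoCopyToState, demo_copy_one_row and so on) is hypothetical, and the real structures in copyapi.h may differ. It assumes that a NULL routine means "built-in format", so the built-in path is called directly and only custom formats pay for the per-row callback dispatch.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical per-format callback table; NOT the real copyapi.h. */
typedef struct DemoCopyToRoutine
{
    void        (*one_row) (const char *row, char *out, size_t outlen);
} DemoCopyToRoutine;

/* Hypothetical COPY TO state; routine is NULL for built-in formats. */
typedef struct DemoCopyToState
{
    const DemoCopyToRoutine *routine;
} DemoCopyToState;

/* Stand-in for the existing built-in text path, left unchanged. */
static void
demo_text_one_row(const char *row, char *out, size_t outlen)
{
    snprintf(out, outlen, "%s\n", row);
}

/* Stand-in for a format provided by an extension. */
static void
demo_custom_one_row(const char *row, char *out, size_t outlen)
{
    snprintf(out, outlen, "<%s>", row);
}

/*
 * Per-row entry point: custom formats go through the callback,
 * built-in formats keep their direct call.
 */
static void
demo_copy_one_row(const DemoCopyToState *state, const char *row,
                  char *out, size_t outlen)
{
    if (state->routine != NULL)
        state->routine->one_row(row, out, outlen);
    else
        demo_text_one_row(row, out, outlen);
}
```

With this shape, the only extra cost for the built-in formats is one NULL check per row, at the price of keeping two code paths.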

I attach a patch for it.

There is a large hunk for CopyOneRowTo() that is caused by
an indentation change. I also attach "...-w.patch", which was
generated with git's -w option to omit whitespace-only
changes. "...-w.patch" is for review only. We should use the
.patch without -w for the push.

Thanks,
--
kou

Attachment                                           Content-Type  Size
v16-0001-Add-CopyFromRoutine-CopyToRountine.patch    text/x-patch  12.6 KB
v16-0001-Add-CopyFromRoutine-CopyToRountine-w.patch  text/x-patch  10.2 KB
