Re: Make COPY format extendable: Extract COPY TO format implementations

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-02-09 04:21:34
Message-ID: ZcWoTr1N0GELFA9E@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 07, 2024 at 01:33:18PM +0900, Michael Paquier wrote:
> Hmm. That explains why I was not seeing any differences with this
> callback then. It seems to me that the order of actions to take is
> clear, like:
> - Revert 2889fd23be56 to keep a clean state of the tree, now done with
> 1aa8324b81fa.
> - Dive into the strlen() issue, as it really looks like this can
> create more simplifications for the patch discussed on this thread
> with COPY TO.

This has been done this morning with b619852086ed.

> - Revisit what we have here, looking at more profiles to see how HEAD
> an v13 compare. It looks like we are on a good path, but let's tackle
> things one step at a time.

And attached is a v14 that's rebased on HEAD. While on it, I've
looked at more profiles and did more runtime checks.

Some runtimes, in (ms), average of 15 runs, 30 int attributes on 5M
rows as mentioned above:
COPY FROM text binary
HEAD 6066 7110
v14 6087 7105
COPY TO text binary
HEAD 6591 10161
v14 6508 10189

And here are some profiles, where I'm not seeing an impact at
row-level with the addition of the callbacks:
COPY FROM, text, master:
- 66.59% 16.10% postgres postgres [.] NextCopyFrom ▒ - 50.50% NextCopyFrom
- 30.75% NextCopyFromRawFields
+ 15.93% CopyReadLine
13.73% CopyReadAttributesText
- 19.43% InputFunctionCallSafe
+ 13.49% int4in
0.77% pg_strtoint32_safe
+ 16.10% _start
COPY FROM, text, v14:
- 66.42% 0.74% postgres postgres [.] NextCopyFrom
- 65.67% NextCopyFrom
- 65.51% CopyFromTextOneRow
- 30.25% NextCopyFromRawFields
+ 16.14% CopyReadLine
13.40% CopyReadAttributesText
- 18.96% InputFunctionCallSafe
+ 13.15% int4in
0.70% pg_strtoint32_safe
+ 0.74% _start

COPY TO, binary, master
- 90.32% 7.14% postgres postgres [.] CopyOneRowTo
- 83.18% CopyOneRowTo
+ 60.30% SendFunctionCall
+ 10.99% appendBinaryStringInfo
+ 3.67% MemoryContextReset
+ 2.89% CopySendEndOfRow
0.89% memcpy(at)plt
0.66% 0xffffa052db5c
0.62% enlargeStringInfo
0.56% pgstat_progress_update_param
+ 7.14% _start
COPY TO, binary, v14
- 90.96% 0.21% postgres postgres [.] CopyOneRowTo
- 90.75% CopyOneRowTo
- 81.86% CopyToBinaryOneRow
+ 59.17% SendFunctionCall
+ 10.56% appendBinaryStringInfo
1.10% enlargeStringInfo
0.59% int4send
0.57% memcpy(at)plt
+ 3.68% MemoryContextReset
+ 2.83% CopySendEndOfRow
1.13% appendBinaryStringInfo
0.58% SendFunctionCall
0.58% pgstat_progress_update_param

Are there any comments about this v14? Sutou-san?

A next step I think we could take is to split the binary-only and the
text/csv-only data in each cstate into their own structure to make the
structure, with an opaque pointer that custom formats could use, but a
lot of fields are shared as well. This patch is already complicated
enough IMO, so I'm OK to leave it out for the moment, and focus on
making this infra pluggable as a next step.
--
Michael

Attachment Content-Type Size
v14-0001-Extract-COPY-FROM-TO-format-implementations.patch text/x-diff 36.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-02-09 04:26:43 Re: Synchronizing slots from primary to standby
Previous Message Sutou Kouhei 2024-02-09 04:19:50 Re: Make COPY format extendable: Extract COPY TO format implementations