Re: Make COPY format extendable: Extract COPY TO format implementations

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sutou Kouhei <kou(at)clear-code(dot)com>
Cc: sawada(dot)mshk(at)gmail(dot)com, zhjwpku(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-02-02 00:52:04
Message-ID: Zbw8tM4j2LbFgY6o@paquier.xyz
Lists: pgsql-hackers

On Fri, Feb 02, 2024 at 06:51:02AM +0900, Michael Paquier wrote:
> I am going to try to plug in some rusage() calls in the backend for
> the COPY paths. I hope that gives more precision about the backend
> activity. I'll post that with more numbers.

And here they are with log_statement_stats enabled to get rusage() for
these queries:
         test         |  user_s  | system_s | elapsed_s
----------------------+----------+----------+-----------
 head_to_bin_1col     | 1.639761 | 0.007998 |  1.647762
 v7_to_bin_1col       | 1.645499 | 0.004003 |  1.649498
 v10_to_bin_1col      | 1.639466 | 0.004008 |  1.643488

 head_to_bin_10col    | 7.486369 | 0.056007 |  7.542485
 v7_to_bin_10col      | 7.314341 | 0.039990 |  7.354743
 v10_to_bin_10col     | 7.329355 | 0.052007 |  7.381408

 head_to_text_1col    | 1.581140 | 0.012000 |  1.593166
 v7_to_text_1col      | 1.615441 | 0.003992 |  1.619446
 v10_to_text_1col     | 1.613443 | 0.000000 |  1.613454

 head_to_text_10col   | 5.897014 | 0.011990 |  5.909063
 v7_to_text_10col     | 5.722872 | 0.016014 |  5.738979
 v10_to_text_10col    | 5.762286 | 0.011993 |  5.774265

 head_from_bin_1col   | 1.524038 | 0.020000 |  1.544046
 v7_from_bin_1col     | 1.551367 | 0.016015 |  1.567408
 v10_from_bin_1col    | 1.560087 | 0.016001 |  1.576115

 head_from_bin_10col  | 5.238444 | 0.139993 |  5.378595
 v7_from_bin_10col    | 5.170503 | 0.076021 |  5.246588
 v10_from_bin_10col   | 5.106496 | 0.112020 |  5.218565

 head_from_text_1col  | 1.664124 | 0.003998 |  1.668172
 v7_from_text_1col    | 1.720616 | 0.007990 |  1.728617
 v10_from_text_1col   | 1.683950 | 0.007990 |  1.692098

 head_from_text_10col | 4.859651 | 0.015996 |  4.875747
 v7_from_text_10col   | 4.775975 | 0.032000 |  4.808051
 v10_from_text_10col  | 4.737512 | 0.028012 |  4.765522
(24 rows)

Looking at this table, I still see a lot of variance in the tests
with tables involving 1 attribute. However, a second thing stands out
to me here: there is a speedup in the 10-attribute case for both COPY
FROM and COPY TO, and for both formats. The data posted at [1] shows
the same trend. In short, let's move on with this split refactoring
with the per-row callbacks. That clearly shows benefits.
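
For anyone who wants to check the trend against the numbers, here is a
small sketch that computes the relative change in elapsed_s of v7 and
v10 against head. The elapsed values are copied from the table above;
the helper names (speedup_pct, speedups) and the choice to compare only
elapsed_s are assumptions made for illustration and are not part of
the patch set.

```python
# Elapsed seconds copied from the rusage() table in this message.
# Keys are "<direction>_<format>_<ncols>"; values map build -> elapsed_s.
ELAPSED = {
    "to_bin_1col":     {"head": 1.647762, "v7": 1.649498, "v10": 1.643488},
    "to_bin_10col":    {"head": 7.542485, "v7": 7.354743, "v10": 7.381408},
    "to_text_1col":    {"head": 1.593166, "v7": 1.619446, "v10": 1.613454},
    "to_text_10col":   {"head": 5.909063, "v7": 5.738979, "v10": 5.774265},
    "from_bin_1col":   {"head": 1.544046, "v7": 1.567408, "v10": 1.576115},
    "from_bin_10col":  {"head": 5.378595, "v7": 5.246588, "v10": 5.218565},
    "from_text_1col":  {"head": 1.668172, "v7": 1.728617, "v10": 1.692098},
    "from_text_10col": {"head": 4.875747, "v7": 4.808051, "v10": 4.765522},
}


def speedup_pct(head, patched):
    """Relative improvement over head, in percent (positive means faster)."""
    return (head - patched) / head * 100.0


def speedups(data):
    """Return {test: {build: percent improvement over head}} for v7 and v10."""
    return {
        test: {b: speedup_pct(row["head"], row[b]) for b in ("v7", "v10")}
        for test, row in data.items()
    }


if __name__ == "__main__":
    for test, row in speedups(ELAPSED).items():
        print(f"{test:16s} v7 {row['v7']:+6.2f}%   v10 {row['v10']:+6.2f}%")
```

A positive percentage means the patched build finished faster than
head. These are single runs, so small differences, in particular in
the 1-attribute rows, should be read with the variance mentioned above
in mind.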

[1] https://www.postgresql.org/message-id/Zbr6piWuVHDtFFOl@paquier.xyz
--
Michael
