Re: Parallel copy

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-09-24 07:04:41
Message-ID: CALj2ACW8mQ+e699fynEZ83rBPBH+BOxQd8N-mvrnmcu1KP2kCg@mail.gmail.com
Lists: pgsql-hackers

Thanks, Greg, for the testing.

On Thu, Sep 24, 2020 at 8:27 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> > 3. Could you please run the test case 3 times at least? Just to ensure
> > the consistency of the issue.
>
> Yes, have run 4 times. Seems to be a performance hit (whether normal
> copy or parallel-1 copy) on the first COPY run on a freshly created
> database. After that, results are consistent.
>

From the logs, I see that this happens only with the default
postgresql.conf. There, the table insertion times are inconsistent,
especially between the 1st and 2nd runs, and the run-to-run variation is
larger overall. This is expected with the default postgresql.conf because
of interference from background processes. That's why we usually run with
a custom configuration to measure the performance gain correctly.

br_default_0_1.log:
2020-09-23 22:32:36.944 JST [112616] LOG: totaltableinsertiontime = 155068.244 ms
2020-09-23 22:33:57.615 JST [11426] LOG: totaltableinsertiontime = 42096.275 ms
2020-09-23 22:37:39.192 JST [43097] LOG: totaltableinsertiontime = 29135.262 ms
2020-09-23 22:38:56.389 JST [54205] LOG: totaltableinsertiontime = 38953.912 ms
2020-09-23 22:40:27.573 JST [66485] LOG: totaltableinsertiontime = 27895.326 ms
2020-09-23 22:41:34.948 JST [77523] LOG: totaltableinsertiontime = 28929.642 ms
2020-09-23 22:43:18.938 JST [89857] LOG: totaltableinsertiontime = 30625.015 ms
2020-09-23 22:44:21.938 JST [101372] LOG: totaltableinsertiontime = 24624.045 ms

br_default_1_0.log:
2020-09-24 11:12:14.989 JST [56146] LOG: totaltableinsertiontime = 192068.350 ms
2020-09-24 11:13:38.228 JST [88455] LOG: totaltableinsertiontime = 30999.942 ms
2020-09-24 11:15:50.381 JST [108935] LOG: totaltableinsertiontime = 31673.204 ms
2020-09-24 11:17:14.260 JST [118541] LOG: totaltableinsertiontime = 31367.027 ms
2020-09-24 11:20:18.975 JST [17270] LOG: totaltableinsertiontime = 26858.924 ms
2020-09-24 11:22:17.822 JST [26852] LOG: totaltableinsertiontime = 66531.442 ms
2020-09-24 11:24:09.221 JST [47971] LOG: totaltableinsertiontime = 38943.384 ms
2020-09-24 11:25:30.955 JST [58849] LOG: totaltableinsertiontime = 28286.634 ms

br_custom_0_1.log:
2020-09-24 10:29:44.956 JST [110477] LOG: totaltableinsertiontime = 20207.928 ms
2020-09-24 10:30:49.570 JST [120568] LOG: totaltableinsertiontime = 23360.006 ms
2020-09-24 10:32:31.659 JST [2753] LOG: totaltableinsertiontime = 19837.588 ms
2020-09-24 10:35:49.245 JST [31118] LOG: totaltableinsertiontime = 21759.253 ms
2020-09-24 10:36:54.834 JST [41763] LOG: totaltableinsertiontime = 23547.323 ms
2020-09-24 10:38:53.507 JST [56779] LOG: totaltableinsertiontime = 21543.984 ms
2020-09-24 10:39:58.713 JST [67489] LOG: totaltableinsertiontime = 25254.563 ms

br_custom_1_0.log:
2020-09-24 10:49:03.242 JST [15308] LOG: totaltableinsertiontime = 16541.201 ms
2020-09-24 10:50:11.848 JST [23324] LOG: totaltableinsertiontime = 15076.577 ms
2020-09-24 10:51:24.497 JST [35394] LOG: totaltableinsertiontime = 16400.777 ms
2020-09-24 10:52:32.354 JST [42953] LOG: totaltableinsertiontime = 15591.051 ms
2020-09-24 10:54:30.327 JST [61136] LOG: totaltableinsertiontime = 16700.954 ms
2020-09-24 10:55:38.377 JST [68719] LOG: totaltableinsertiontime = 15435.150 ms
2020-09-24 10:57:08.927 JST [83335] LOG: totaltableinsertiontime = 17133.251 ms
2020-09-24 10:58:17.420 JST [90905] LOG: totaltableinsertiontime = 15352.753 ms
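
For anyone who wants to compare the run-to-run variation in the log
excerpts above, here is a minimal Python sketch. It is only an
illustration, not part of the patch or the test scripts: the names
parse_times and summarize are hypothetical, the "spread" figure
(max minus min, as a percentage of the mean) is just one possible
measure, and the two samples are the first three entries of
br_default_0_1.log and br_custom_0_1.log as pasted above.

```python
import re
import statistics

# Matches the timing entries shown in the logs above. \s* also matches a
# line break, so entries wrapped by a mail client still parse.
TIME_RE = re.compile(
    r"totaltableinsertiontime\s*=\s*([0-9]+(?:\.[0-9]+)?)\s*ms"
)


def parse_times(log_text):
    """Return every totaltableinsertiontime value (in ms) found in log_text."""
    return [float(value) for value in TIME_RE.findall(log_text)]


def summarize(times):
    """Return run count, min, max, mean, and (max - min) as a % of the mean."""
    mean = statistics.mean(times)
    return {
        "runs": len(times),
        "min_ms": min(times),
        "max_ms": max(times),
        "mean_ms": mean,
        "spread_pct": 100.0 * (max(times) - min(times)) / mean,
    }


# First three entries of br_default_0_1.log (default postgresql.conf).
DEFAULT_SAMPLE = """
2020-09-23 22:32:36.944 JST [112616] LOG: totaltableinsertiontime =
155068.244 ms
2020-09-23 22:33:57.615 JST [11426] LOG: totaltableinsertiontime =
42096.275 ms
2020-09-23 22:37:39.192 JST [43097] LOG: totaltableinsertiontime =
29135.262 ms
"""

# First three entries of br_custom_0_1.log (custom configuration).
CUSTOM_SAMPLE = """
2020-09-24 10:29:44.956 JST [110477] LOG: totaltableinsertiontime = 20207.928 ms
2020-09-24 10:30:49.570 JST [120568] LOG: totaltableinsertiontime = 23360.006 ms
2020-09-24 10:32:31.659 JST [2753] LOG: totaltableinsertiontime = 19837.588 ms
"""

if __name__ == "__main__":
    for label, sample in (("default", DEFAULT_SAMPLE), ("custom", CUSTOM_SAMPLE)):
        stats = summarize(parse_times(sample))
        print(label, stats)
```

Feeding the full log files through parse_times instead of the short
samples would give the same kind of summary for all four runs.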

>
> Test results show that Parallel COPY with 1 worker is performing
> better than normal COPY in the test scenarios run.
>

Good to know :)

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
