Re: Parallel copy

From: Greg Nancarrow <gregn4422(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-09-24 02:56:09
Message-ID: CAJcOf-dzYj9-8Fb9aLebi3BCq7sHnKHDAUcN0nG-MLromDC2DA@mail.gmail.com
Lists: pgsql-hackers

Hi Bharath,

> Few things to follow before testing:
> 1. Is the table being dropped/truncated after the test with 0 workers and before running with 1 worker? If not, then the index insertion time would increase.[1](for me it is increasing by 10 sec). This is obvious because the 1st time index will be created from bottom up manner(from leaves to root), but for the 2nd time it has to search and insert at the proper leaves and inner B+Tree nodes.

Yes, it's being truncated before running each and every COPY.

> 2. If possible, can you also run with custom postgresql.conf settings[2] along with default? Just to ensure that other bg processes such as checkpointer, autovacuum, bgwriter etc. don't affect our testcase. For instance, with default postgresql.conf file, it looks like checkpointing[3] is happening frequently, could you please let us know if that happens at your end?

Yes, have run with default and your custom settings. With default
settings, I can confirm that checkpointing is happening frequently
with the tests I've run here.

> 3. Could you please run the test case 3 times at least? Just to ensure the consistency of the issue.

Yes, have run 4 times. Seems to be a performance hit (whether normal
copy or parallel-1 copy) on the first COPY run on a freshly created
database. After that, results are consistent.

> 4. I ran the tests in a performance test system where no other user processes(except system processes) are running. Is it possible for you to do the same?
>
> Please capture and share the timing logs with us.
>

Yes, I have ensured the system is as idle as possible prior to testing.

I have attached the test results obtained after building with your
Parallel Copy patch and testing patch applied (HEAD at
733fa9aa51c526582f100aa0d375e0eb9a6bce8b).

Test results show that Parallel COPY with 1 worker is performing
better than normal COPY in the test scenarios run. There is a
performance hit (regardless of COPY type) on the very first COPY run
on a freshly-created database.

I ran the test case 4 times, and also in reverse order, with a truncate
run before each COPY (output and logs named xxxx_0_1 are from running
normal COPY and then parallel COPY; those named xxxx_1_0 are from
running parallel COPY and then normal COPY).
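
To make the run ordering concrete, here is a rough sketch of the step
sequence as I understand it. It is illustrative only and is not the script
that was actually used: the function names are hypothetical, and it assumes
4 runs per ordering, with 0 meaning normal COPY and 1 meaning parallel COPY
with 1 worker.

```python
def build_plan(order, runs=4):
    """Return the ordered steps for one ordering, e.g. order=(0, 1).

    Each step is (run_number, action, workers). The names and the
    4-runs-per-ordering layout are assumptions made for illustration.
    """
    steps = []
    for run in range(1, runs + 1):
        for workers in order:
            # Truncate before every COPY so each load starts from an empty table.
            steps.append((run, "TRUNCATE", None))
            steps.append((run, "COPY", workers))
    return steps


def suffix(order):
    """Build the naming suffix used for output and logs, e.g. '0_1'."""
    return "_".join(str(w) for w in order)


if __name__ == "__main__":
    for order in [(0, 1), (1, 0)]:
        print("xxxx_" + suffix(order))
        for run, action, workers in build_plan(order):
            detail = "" if workers is None else f" workers={workers}"
            print(f"  run {run}: {action}{detail}")
```

The point of the layout is that every COPY, in either ordering, is
immediately preceded by a truncate, so neither variant inserts into an
already-populated index.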

Please refer to attached results.

Regards,
Greg

Attachment Content-Type Size
testing_patch_results.tar.gz application/gzip 5.2 KB
