Re: Parallel copy

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-10-03 00:49:59
Message-ID: 20201003004959.73ot57oeikhtuq4u@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Vignesh,

I've done some basic benchmarking on the v4 version of the patches (but
AFAIKC the v5 should perform about the same), and some initial review.

For the benchmarking, I used the lineitem table from TPC-H - for 75GB
data set, this largest table is about 64GB once loaded, with another
54GB in 5 indexes. This is on a server with 32 cores, 64GB of RAM and
NVME storage.

The COPY duration with varying number of workers (specified using the
parallel COPY option) looks like this:

workers duration
---------------------
0 1366
1 1255
2 704
3 526
4 434
5 385
6 347
7 322
8 327

So this seems to work pretty well - initially we get almost linear
speedup, then it slows down (likely due to contention for locks, I/O
etc.). Not bad.

I've only done a quick review, but overall the patch looks in fairly
good shape.

1) I don't quite understand why we need INCREMENTPROCESSED and
RETURNPROCESSED, considering it just does ++ or return. It just
obfuscated the code, I think.

2) I find it somewhat strange that BeginParallelCopy can just decide not
to do parallel copy after all. Why not to do this decisions in the
caller? Or maybe it's fine this way, not sure.

3) AFAIK we don't modify typedefs.list in patches, so these changes
should be removed.

4) IsTriggerFunctionParallelSafe actually checks all triggers, not just
one, so the comment needs minor rewording.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2020-10-03 02:05:59 Re: Improve choose_custom_plan for initial partition prune case
Previous Message James Coleman 2020-10-02 23:07:00 Re: enable_incremental_sort changes query behavior