Re: Parallel copy

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-11-18 06:09:07
Message-ID: CALj2ACXttbVQa0L46nnHVJN7n70gazCTEKNm0dehh_D70Zc01Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 29, 2020 at 2:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> 4) Worker has to hop through all the processed chunks before getting
> the chunk which it can process.
>
> One more point, I have noticed that some time back [1], I have given
> one suggestion related to the way workers process the set of lines
> (aka chunk). I think you can try by increasing the chunk size to say
> 100, 500, 1000 and use some shared counter to remember the number of
> chunks processed.
>

Hi, I did some analysis comparing a spinlock-protected worker write position
(i.e. each worker acquires a spinlock on a shared write position to choose
the next available chunk) against the current approach where each worker
hops through the processed chunks to find the next available chunk position:

Use case: 10 million rows, 5.6GB of data, 2 indexes on integer columns, 1
index on a text column. Results are of the form (number of workers, total
exec time in sec, index insertion time in sec, worker write pos get time in
sec, buffer contention event count):

With spinlock:
(1,1126.443,1060.067,0.478,*0*), (2,669.343,630.769,0.306,*26*),
(4,346.297,326.950,0.161,*89*), (8,209.600,196.417,0.088,*291*),
(16,166.113,157.086,0.065,*1468*), (20,173.884,166.013,0.067,*2700*),
(30,173.087,166.565,0.0065,*5346*)
Without spinlock:
(1,1119.695,1054.586,0.496,*0*), (2,645.733,608.313,1.5,*8*),
(4,340.620,320.344,1.6,*58*), (8,203.985,189.644,1.3,*222*),
(16,142.997,133.045,1,*813*), (20,132.621,122.527,1.1,*1215*),
(30,135.737,126.716,1.5,*2901*)

With the spinlock, each worker gets the required write position quickly and
proceeds straight to the index insertion, which becomes a single point of
contention: we observed more buffer lock contention there because all the
workers reach the index insertion point at about the same time.

Without the spinlock, each worker spends some time hopping to get its write
position, during which the other workers are inserting into the indexes. So
the workers do not all reach the index insertion point at the same time, and
hence there is less buffer lock contention.

The same behaviour (explained above) is observed with different worker chunk
counts (default 64, 128, 512 and 1024), i.e. the number of tuples each
worker caches into its local memory before inserting into the table.

In summary: with the spinlock, it looks like we avoid workers waiting to get
the next chunk, which also means we are not creating any contention point
inside the parallel copy code. However, this creates another choke point,
namely index insertion when indexes exist on the table, which is outside the
scope of the parallel copy code. We think it would be good to use either a
spinlock-protected worker write position or an atomic variable for the
worker write position (which performs equal to the spinlock, or a little
better on some platforms). Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
