RE: Parallel copy

From: "Hou, Zhijie" <houzj(dot)fnst(at)cn(dot)fujitsu(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Parallel copy
Date: 2020-11-05 13:02:52
Message-ID: 8f241649021d4cac85e884aac7166656@G08CNEXMBPEKD05.g08.fujitsu.local
Lists: pgsql-hackers

Hi

>
> my $bytes = $ARGV[0];
> for(my $i = 0; $i < $bytes; $i+=8){
> print "longdata";
> }
> print "\n";
> --------
>
> postgres=# copy longdata from program 'perl /tmp/longdata.pl 100000000'
> with (parallel 2);
>
> This gets stuck forever (or at least I didn't have the patience to wait
> it finish). Both worker processes are consuming 100% of CPU.

I had a look at this problem.

The ParallelCopyDataBlock has a fixed size limit:
    uint8 skip_bytes;
    char data[DATA_BLOCK_SIZE]; /* data read from file */

It seems the input line is so long that the leader process runs out of the shared memory blocks shared among the parallel copy workers,
and the leader process keeps waiting for a free block.

The worker process, for its part, waits until line_state becomes LINE_LEADER_POPULATED,
but the leader process won't set line_state until it has read the whole line.

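So both sides are stuck forever: the leader waits for a block that only a worker could free, and the workers wait for a line that the leader can never finish.
Maybe we should reconsider how this situation is handled.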
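
To make the arithmetic concrete, here is a minimal sketch of the condition, not the patch's actual code. The names blocks_needed and leader_can_publish_line, and the constants SKETCH_BLOCK_SIZE and SKETCH_BLOCK_COUNT, are hypothetical. The block size of 65536 is taken from copy_buf_len in the leader stack below; the block count of 1024 is only an assumption, although it is consistent with line_size=67108864 (1024 * 65536) in the same frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values for illustration; the patch's real constants may differ. */
#define SKETCH_BLOCK_SIZE   65536u  /* matches copy_buf_len in the leader stack */
#define SKETCH_BLOCK_COUNT  1024u   /* assumed number of shared data blocks */

/* Number of data blocks needed to hold one line of line_size bytes. */
size_t
blocks_needed(size_t line_size, size_t block_size)
{
    assert(block_size > 0);
    return (line_size + block_size - 1) / block_size;   /* round up */
}

/*
 * Model of the leader filling blocks for a single line.
 *
 * Workers only start consuming (and later freeing) blocks once the line
 * is marked as populated, and the leader only marks it after reading the
 * whole line.  So while a line is being read, no block is ever freed.
 *
 * Returns true if the leader can finish the line, false if it would end
 * up waiting for a free block that nobody can release.
 */
bool
leader_can_publish_line(size_t line_size, size_t block_size, size_t block_count)
{
    size_t need = blocks_needed(line_size, block_size);
    size_t used = 0;

    while (used < need)
    {
        if (used == block_count)
            return false;   /* all blocks held by the unfinished line */
        used++;             /* leader claims the next free block */
    }
    return true;            /* whole line read; line state can be set */
}
```

Under these assumptions, the 100000000-byte line from the test case needs 1526 blocks, so leader_can_publish_line(100000000, SKETCH_BLOCK_SIZE, SKETCH_BLOCK_COUNT) returns false, while any line of up to 67108864 bytes would fit.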

The stacks are as follows:

Leader stack:
#3 0x000000000075f7a1 in WaitLatch (latch=<optimized out>, wakeEvents=wakeEvents@entry=41, timeout=timeout@entry=1, wait_event_info=wait_event_info@entry=150994945) at latch.c:411
#4 0x00000000005a9245 in WaitGetFreeCopyBlock (pcshared_info=pcshared_info@entry=0x7f26d2ed3580) at copyparallel.c:1546
#5 0x00000000005a98ce in SetRawBufForLoad (cstate=cstate@entry=0x2978a88, line_size=67108864, copy_buf_len=copy_buf_len@entry=65536, raw_buf_ptr=raw_buf_ptr@entry=65536,
    copy_raw_buf=copy_raw_buf@entry=0x7fff4cdc0e18) at copyparallel.c:1572
#6 0x00000000005a1963 in CopyReadLineText (cstate=cstate@entry=0x2978a88) at copy.c:4058
#7 0x00000000005a4e76 in CopyReadLine (cstate=cstate@entry=0x2978a88) at copy.c:3863

Worker stack:
#0 GetLinePosition (cstate=cstate@entry=0x29e1f28) at copyparallel.c:1474
#1 0x00000000005a8aa4 in CacheLineInfo (cstate=cstate@entry=0x29e1f28, buff_count=buff_count@entry=0) at copyparallel.c:711
#2 0x00000000005a8e46 in GetWorkerLine (cstate=cstate@entry=0x29e1f28) at copyparallel.c:885
#3 0x00000000005a4f2e in NextCopyFromRawFields (cstate=cstate@entry=0x29e1f28, fields=fields@entry=0x7fff4cdc0b48, nfields=nfields@entry=0x7fff4cdc0b44) at copy.c:3615
#4 0x00000000005a50af in NextCopyFrom (cstate=cstate@entry=0x29e1f28, econtext=econtext@entry=0x2a358d8, values=0x2a42068, nulls=0x2a42070) at copy.c:3696
#5 0x00000000005a5b90 in CopyFrom (cstate=cstate@entry=0x29e1f28) at copy.c:2985

Best regards,
houzj
