Re: Parallel copy

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Subject: Re: Parallel copy
Date: 2020-09-16 13:05:56
Message-ID: CAE9k0Pnw0ZvzBUXMp+8hhAa4j+=p2F=Z3Kymq6_B-tGnYiZsgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Vignesh,

I've spent some time today looking at your new set of patches and I've
some thoughts and queries which I would like to put here:

Why are these not part of the shared cstate structure?

SerializeString(pcxt, PARALLEL_COPY_KEY_NULL_PRINT, cstate->null_print);
SerializeString(pcxt, PARALLEL_COPY_KEY_DELIM, cstate->delim);
SerializeString(pcxt, PARALLEL_COPY_KEY_QUOTE, cstate->quote);
SerializeString(pcxt, PARALLEL_COPY_KEY_ESCAPE, cstate->escape);

I think in the refactoring patch we could replace all the cstate
variables that would be shared between the leader and workers with a
common structure which would be used even for a serial copy. Thoughts?

--

Have you tested your patch when encoding conversion is needed? If so,
could you please point out the email that has the test results.

--

Apart from above, I've noticed some cosmetic errors which I am sharing here:

+#define IsParallelCopy() (cstate->is_parallel)
+#define IsLeader() (cstate->pcdata->is_leader)

This doesn't look to be properly aligned.

--

+ shared_info_ptr = (ParallelCopyShmInfo *)
shm_toc_allocate(pcxt->toc, sizeof(ParallelCopyShmInfo));
+ PopulateParallelCopyShmInfo(shared_info_ptr, full_transaction_id);

..

+ /* Store shared build state, for which we reserved space. */
+ shared_cstate = (SerializedParallelCopyState
*)shm_toc_allocate(pcxt->toc, est_cstateshared);

In the first case, while typecasting you've added a space between the
typename and the function but that is missing in the second case. I
think it would be good if you could make it consistent.

Same comment applies here as well:

+ pg_atomic_uint32 line_state; /* line state */
+ uint64 cur_lineno; /* line number for error messages */
+}ParallelCopyLineBoundary;

...

+ CommandId mycid; /* command id */
+ ParallelCopyLineBoundaries line_boundaries; /* line array */
+} ParallelCopyShmInfo;

There is no space between the closing brace and the structure name in
the first case but it is in the second one. So, again this doesn't
look consistent.

I could also find this type of inconsistency in comments. See below:

+/* It can hold upto 10000 record information for worker to process. RINGSIZE
+ * should be a multiple of WORKER_CHUNK_COUNT, as wrap around cases
is currently
+ * not handled while selecting the WORKER_CHUNK_COUNT by the worker. */
+#define RINGSIZE (10 * 1000)

...

+/*
+ * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data
+ * block to process to avoid lock contention. Read RINGSIZE comments before
+ * changing this value.
+ */
+#define WORKER_CHUNK_COUNT 50

You may see these kinds of errors at other places as well if you scan
through your patch.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

On Wed, Aug 19, 2020 at 11:51 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> Thanks Greg for reviewing the patch. Please find my thoughts for your comments.
>
> On Mon, Aug 17, 2020 at 9:44 AM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
> > Some further comments:
> >
> > (1) v3-0002-Framework-for-leader-worker-in-parallel-copy.patch
> >
> > +/*
> > + * Each worker will be allocated WORKER_CHUNK_COUNT of records from DSM data
> > + * block to process to avoid lock contention. This value should be divisible by
> > + * RINGSIZE, as wrap around cases is currently not handled while selecting the
> > + * WORKER_CHUNK_COUNT by the worker.
> > + */
> > +#define WORKER_CHUNK_COUNT 50
> >
> >
> > "This value should be divisible by RINGSIZE" is not a correct
> > statement (since obviously 50 is not divisible by 10000).
> > It should say something like "This value should evenly divide into
> > RINGSIZE", or "RINGSIZE should be a multiple of WORKER_CHUNK_COUNT".
> >
>
> Fixed. Changed it to RINGSIZE should be a multiple of WORKER_CHUNK_COUNT.
>
> > (2) v3-0003-Allow-copy-from-command-to-process-data-from-file.patch
> >
> > (i)
> >
> > + /*
> > + * If the data is present in current block
> > lineInfo. line_size
> > + * will be updated. If the data is spread
> > across the blocks either
> >
> > Somehow a space has been put between "lineinfo." and "line_size".
> > It should be: "If the data is present in current block
> > lineInfo.line_size will be updated"
>
> Fixed, changed it to lineinfo->line_size.
>
> >
> > (ii)
> >
> > >This is not possible because of pg_atomic_compare_exchange_u32, this
> > >will succeed only for one of the workers whose line_state is
> > >LINE_LEADER_POPULATED, for other workers it will fail. This is
> > >explained in detail above ParallelCopyLineBoundary.
> >
> > Yes, but prior to that call to pg_atomic_compare_exchange_u32(),
> > aren't you separately reading line_state and line_state, so that
> > between those reads, it may have transitioned from leader to another
> > worker, such that the read line state ("cur_line_state", being checked
> > in the if block) may not actually match what is now in the line_state
> > and/or the read line_size ("dataSize") doesn't actually correspond to
> > the read line state?
> >
> > (sorry, still not 100% convinced that the synchronization and checks
> > are safe in all cases)
> >
>
> I think that you are describing about the problem could happen in the
> following case:
> when we read curr_line_state, the value was LINE_WORKER_PROCESSED or
> LINE_WORKER_PROCESSING. Then in some cases if the leader is very fast
> compared to the workers then the leader quickly populates one line and
> sets the state to LINE_LEADER_POPULATED. State is changed to
> LINE_LEADER_POPULATED when we are checking the currr_line_state.
> I feel this will not be a problem because, Leader will populate & wait
> till some RING element is available to populate. In the meantime
> worker has seen that state is LINE_WORKER_PROCESSED or
> LINE_WORKER_PROCESSING(previous state that it read), worker has
> identified that this chunk was processed by some other worker, worker
> will move and try to get the next available chunk & insert those
> records. It will keep continuing till it gets the next chunk to
> process. Eventually one of the workers will get this chunk and process
> it.
>
> > (3) v3-0006-Parallel-Copy-For-Binary-Format-Files.patch
> >
> > >raw_buf is not used in parallel copy, instead raw_buf will be pointing
> > >to shared memory data blocks. This memory was allocated as part of
> > >BeginCopyFrom, uptil this point we cannot be 100% sure as copy can be
> > >performed sequentially like in case max_worker_processes is not
> > >available, if it switches to sequential mode raw_buf will be used
> > >while performing copy operation. At this place we can safely free this
> > >memory that was allocated
> >
> > So the following code (which checks raw_buf, which still points to
> > memory that has been pfreed) is still valid?
> >
> > In the SetRawBufForLoad() function, which is called by CopyReadLineText():
> >
> > cur_data_blk_ptr = (cstate->raw_buf) ?
> > &pcshared_info->data_blocks[cur_block_pos] : NULL;
> >
> > The above code looks a bit dicey to me. I stepped over that line in
> > the debugger when I debugged an instance of Parallel Copy, so it
> > definitely gets executed.
> > It makes me wonder what other code could possibly be checking raw_buf
> > and using it in some way, when in fact what it points to has been
> > pfreed.
> >
> > Are you able to add the following line of code, or will it (somehow)
> > break logic that you are relying on?
> >
> > pfree(cstate->raw_buf);
> > cstate->raw_buf = NULL; <=== I suggest that this line is added
> >
>
> You are right, I have debugged & verified it sets it to an invalid
> block which is not expected. There are chances this would have caused
> some corruption in some machines. The suggested fix is required, I
> have fixed it. I have moved this change to
> 0003-Allow-copy-from-command-to-process-data-from-file.patch as
> 0006-Parallel-Copy-For-Binary-Format-Files is only for Binary format
> parallel copy & that change is common change for parallel copy.
>
> I have attached new set of patches with the fixes.
> Thoughts?
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-09-16 13:06:01 Re: Avoid useless retrieval of defaults and check constraints in pg_dump -a
Previous Message Michael Paquier 2020-09-16 13:01:29 Re: Proposal of new PostgreSQL Extension - PGSpiderExt