Re: Parallel copy

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alastair Turner <minion(at)decodable(dot)me>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel copy
Date: 2020-09-28 06:49:58
Message-ID: CAA4eK1LE3Obwg+mTm3Hm3ZUeN=jL_RcJOOEk_XVucEqC=8hCyA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 22, 2020 at 7:48 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Tue, Jul 21, 2020 at 3:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
>
> > Review comments:
> > ===================
> >
> > 0001-Copy-code-readjustment-to-support-parallel-copy
> > 1.
> > @@ -807,8 +835,11 @@ CopyLoadRawBuf(CopyState cstate)
> > else
> > nbytes = 0; /* no data need be saved */
> >
> > + if (cstate->copy_dest == COPY_NEW_FE)
> > + minread = RAW_BUF_SIZE - nbytes;
> > +
> > inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
> > - 1, RAW_BUF_SIZE - nbytes);
> > + minread, RAW_BUF_SIZE - nbytes);
> >
> > No comment to explain why this change is done?
> >
> > 0002-Framework-for-leader-worker-in-parallel-copy
>
> Currently CopyGetData copies less data into the buffer than it could, even though space is available in the buffer, because minread was passed as 1 to CopyGetData. Because of this there are frequent calls to CopyGetData for fetching the data. It loads only some of the data due to the below check:
> while (maxread > 0 && bytesread < minread && !cstate->reached_eof)
> After reading some data, bytesread becomes greater than minread (which is passed as 1) and CopyGetData returns with a smaller amount of data, even though there is still space.
> This change is required for the parallel copy feature: each time we get a new DSM data block of 64K size and copy the data into it. If we copy less data into the DSM data blocks we might end up consuming all the DSM data blocks.
>

Why can't we reuse the DSM block which has unfilled space?

> I felt this issue can be fixed as part of HEAD. I have posted a separate thread [1] for this. I'm planning to remove that change once it gets committed. Can that go as a separate
> patch, or should we include it here?
> [1] - https://www.postgresql.org/message-id/CALDaNm0v4CjmvSnftYnx_9pOS_dKRG%3DO3NnBgJsQmi0KipvLog%40mail.gmail.com
>

I am convinced by the reasoning given by Kyotaro-San in that other
thread [1] and by the performance data shown by Peter that this can't
be an independent improvement; rather, in some cases it can do harm.
Now, if you need it for the parallel-copy path then we can change it
specifically in the parallel-copy code path, but I don't completely
understand your reason for needing it.
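
To be specific, if the larger minread is needed only under parallel
copy, a minimal sketch of what I mean in CopyLoadRawBuf() is below
(IsParallelCopy() here stands for whatever check your patch uses to
detect the parallel path, so treat it as illustrative):

    /*
     * Ask for a full buffer only under parallel copy, so that the
     * behaviour of non-parallel COPY remains unchanged.
     */
    int    minread = 1;

    if (cstate->copy_dest == COPY_NEW_FE && IsParallelCopy())
        minread = RAW_BUF_SIZE - nbytes;

    inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
                          minread, RAW_BUF_SIZE - nbytes);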

> > 2.
..
> > + */
> > +typedef struct ParallelCopyLineBoundary
> >
> > Are we doing all this state management to avoid using locks while
> > processing lines? If so, I think we can use either spinlock or LWLock
> > to keep the main patch simple and then provide a later patch to make
> > it lock-less. This will allow us to first focus on the main design of
> > the patch rather than trying to make this datastructure processing
> > lock-less in the best possible way.
> >
>
> The steps will be more or less the same if we use a spinlock too: steps 1, 3 & 4 will be common, and we would have to use lock & unlock instead of steps 2 & 5. I feel we can retain the current implementation.
>

I'll study this in detail and let you know my opinion, but in the
meantime, I don't follow one part of this comment: "If they
don't follow this order the worker might process wrong line_size and
leader might populate the information which worker has not yet
processed or in the process of processing."

Do you mean that the leader might overwrite some information which the
worker hasn't read yet? If so, that is not clear from the comment.
Another minor point about this comment:

+ * ParallelCopyLineBoundary is common data structure between leader & worker,
+ * Leader process will be populating data block, data block offset & the size of

I think there should be a full-stop after worker instead of a comma.
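
Coming back to the locking question: just to make the comparison
concrete, a rough sketch of the spinlock-based variant I had in mind is
below. The field names are illustrative and don't match your patch
exactly, and the surrounding variables (entry, block_pos, etc.) are
whatever the leader/worker already have in hand:

    typedef struct ParallelCopyLineBoundary
    {
        slock_t     lock;           /* protects the fields below */
        uint32      first_block;    /* data block holding the line start */
        uint32      start_offset;   /* offset of the line in first_block */
        int32       line_size;      /* -1 while the slot is free */
    } ParallelCopyLineBoundary;

    /* Leader: publish a line into a free slot. */
    SpinLockAcquire(&entry->lock);
    if (entry->line_size == -1)
    {
        entry->first_block = block_pos;
        entry->start_offset = start_offset;
        entry->line_size = line_size;   /* makes the line visible to workers */
    }
    SpinLockRelease(&entry->lock);

    /* Worker: claim the line and free the slot for the leader. */
    SpinLockAcquire(&entry->lock);
    if (entry->line_size != -1)
    {
        first_block = entry->first_block;
        start_offset = entry->start_offset;
        line_size = entry->line_size;
        entry->line_size = -1;          /* slot can be reused by the leader */
    }
    SpinLockRelease(&entry->lock);

With everything done under the lock there is no memory-ordering
reasoning left in the comments, which is the simplification I was
after.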

>
> > 6.
> > In function BeginParallelCopy(), you need to keep a provision to
> > collect wal_usage and buf_usage stats. See _bt_begin_parallel for
> > reference. Those will be required for pg_stat_statements.
> >
>
> Fixed
>

How did you ensure that this is fixed? Have you tested it? If so,
please share the test. I see a basic problem with your fix.

+ /* Report WAL/buffer usage during parallel execution */
+ bufferusage = shm_toc_lookup(toc, PARALLEL_COPY_BUFFER_USAGE, false);
+ walusage = shm_toc_lookup(toc, PARALLEL_COPY_WAL_USAGE, false);
+ InstrEndParallelQuery(&bufferusage[ParallelWorkerNumber],
+ &walusage[ParallelWorkerNumber]);

You need to call InstrStartParallelQuery() before the actual operation
starts; without that the stats won't be accurate. Also, after calling
WaitForParallelWorkersToFinish(), you need to accumulate the stats
collected from the workers, which you have neither done nor is it
possible with the current code in your patch, because you haven't made
any provision to capture them in BeginParallelCopy.

I suggest you look into lazy_parallel_vacuum_indexes() and
begin_parallel_vacuum() to understand how the buffer/wal usage stats
are accumulated. Also, please test this functionality using
pg_stat_statements.
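
Roughly, the shape of the fix would be like the below (an untested
sketch; PARALLEL_COPY_BUFFER_USAGE/PARALLEL_COPY_WAL_USAGE are your
patch's keys, and the rest follows the parallel vacuum code):

    /* In BeginParallelCopy(), while sizing and populating the DSM: */
    shm_toc_estimate_chunk(&pcxt->estimator,
                           mul_size(sizeof(BufferUsage), pcxt->nworkers));
    shm_toc_estimate_keys(&pcxt->estimator, 1);
    shm_toc_estimate_chunk(&pcxt->estimator,
                           mul_size(sizeof(WalUsage), pcxt->nworkers));
    shm_toc_estimate_keys(&pcxt->estimator, 1);

    bufferusage = shm_toc_allocate(pcxt->toc,
                                   mul_size(sizeof(BufferUsage), pcxt->nworkers));
    shm_toc_insert(pcxt->toc, PARALLEL_COPY_BUFFER_USAGE, bufferusage);
    walusage = shm_toc_allocate(pcxt->toc,
                                mul_size(sizeof(WalUsage), pcxt->nworkers));
    shm_toc_insert(pcxt->toc, PARALLEL_COPY_WAL_USAGE, walusage);

    /* In the worker, bracketing the actual copy work: */
    InstrStartParallelQuery();
    /* ... perform the copy ... */
    bufferusage = shm_toc_lookup(toc, PARALLEL_COPY_BUFFER_USAGE, false);
    walusage = shm_toc_lookup(toc, PARALLEL_COPY_WAL_USAGE, false);
    InstrEndParallelQuery(&bufferusage[ParallelWorkerNumber],
                          &walusage[ParallelWorkerNumber]);

    /* In the leader, after WaitForParallelWorkersToFinish(pcxt): */
    for (i = 0; i < pcxt->nworkers_launched; i++)
        InstrAccumParallelQuery(&bufferusage[i], &walusage[i]);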

>
> > 0003-Allow-copy-from-command-to-process-data-from-file-ST
> > 10.
> > In the commit message, you have written "The leader does not
> > participate in the insertion of data, leaders only responsibility will
> > be to identify the lines as fast as possible for the workers to do the
> > actual copy operation. The leader waits till all the lines populated
> > are processed by the workers and exits."
> >
> > I think you should also mention that we have chosen this design based
> > on the reason "that everything stalls if the leader doesn't accept
> > further input data, as well as when there are no available split
> > chunks, so it doesn't seem like a good idea to have the leader do other
> > work. This is backed by the performance data where we have seen that
> > with 1 worker there is just a 5-10% (or whatever percentage difference
> > you have seen) performance difference".
>
> Fixed.
>

Make it one paragraph, starting from "The leader does not participate
in the insertion of data .... just a 5-10% performance difference".
Right now the two parts look a bit disconnected.

A few additional comments:
======================
v5-0001-Copy-code-readjustment-to-support-parallel-copy
---------------------------------------------------------------------------------
1.
+/*
+ * CLEAR_EOL_LINE - Wrapper for clearing EOL.
+ */
+#define CLEAR_EOL_LINE() \
+if (!result && !IsHeaderLine()) \
+ ClearEOLFromCopiedData(cstate, cstate->line_buf.data, \
+ cstate->line_buf.len, \
+ &cstate->line_buf.len) \

I don't like this macro. I think it is sufficient to move the common
code into ClearEOLFromCopiedData so that it can be called from both the
parallel and non-parallel paths, while the other checks can be done
in-place. Having macros for such things makes the code less readable.
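
In other words, just something like this at each call site (a sketch
using the macro's current body):

    if (!result && !IsHeaderLine())
        ClearEOLFromCopiedData(cstate, cstate->line_buf.data,
                               cstate->line_buf.len,
                               &cstate->line_buf.len);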

2.
-
+static void PopulateCommonCstateInfo(CopyState cstate, TupleDesc tup_desc,
+ List *attnamelist);

Spurious line removal.

v5-0002-Framework-for-leader-worker-in-parallel-copy
---------------------------------------------------------------------------
3.
+ FullTransactionId full_transaction_id; /* xid for copy from statement */
+ CommandId mycid; /* command id */
+ ParallelCopyLineBoundaries line_boundaries; /* line array */
+} ParallelCopyShmInfo;

We already serialize the FullTransactionId and CommandId via
InitializeParallelDSM->SerializeTransactionState. Can't we reuse that?
I think the recent Parallel Insert patch has also done something for
this [2], so you can refer to that if you want.
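
For example (untested, and assuming the leader has assigned the xid and
command id before InitializeParallelDSM, which it must do anyway before
entering parallel mode), the worker should be able to fetch both from
the restored transaction state instead of from ParallelCopyShmInfo:

    /*
     * StartParallelWorkerTransaction() has already restored the
     * serialized transaction state in the worker, so these return the
     * leader's values.
     */
    FullTransactionId fxid = GetCurrentFullTransactionId();
    CommandId   mycid = GetCurrentCommandId(false);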

v5-0004-Documentation-for-parallel-copy
-----------------------------------------------------------
1. Perform <command>COPY FROM</command> in parallel using <replaceable
+ class="parameter"> integer</replaceable> background workers.

No need for space before integer.

[1] - https://www.postgresql.org/message-id/20200911.155804.359271394064499501.horikyota.ntt%40gmail.com
[2] - https://www.postgresql.org/message-id/CAJcOf-fn1nhEtaU91NvRuA3EbvbJGACMd4_c%2BUu3XU5VMv37Aw%40mail.gmail.com

--
With Regards,
Amit Kapila.
