Re: Gather Merge

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Gather Merge
Date: 2016-10-18 11:59:41
Message-ID: CAGPqQf3-_iFsd6MD9cy16JaAYuHPqWjbt6+zsrD418ngXSZCeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks Amit for reviewing this patch.

On Mon, Oct 17, 2016 at 2:26 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Wed, Oct 5, 2016 at 11:35 AM, Rushabh Lathia
> <rushabh(dot)lathia(at)gmail(dot)com> wrote:
> > Hi hackers,
> >
> > Attached is the patch to implement Gather Merge.
> >
>
> Couple of review comments:
>
> 1.
> ExecGatherMerge()
> {
> ..
> + /* No workers? Then never mind. */
> + if (!got_any_worker
> ||
> + node->nreaders < 2)
> + {
> +
> ExecShutdownGatherMergeWorkers(node);
> + node->nreaders = 0;
> +
> }
>
> Are you planning to restrict the use of gather merge based on number
> of workers, if there is a valid reason, then I think comments should
> be updated for same?
>
>
Thanks for catching this. This is left over from the earlier design patch.
With
current design we don't have any limitation for the number of worker . I did
the performance testing with setting max_parallel_workers_per_gather to 1
and didn't noticed any performance degradation. So I removed this limitation
into attached patch.

2.
> +ExecGatherMerge(GatherMergeState * node){
> ..
> + if (!node->initialized)
> + {
> + EState *estate = node->ps.state;
> +
> GatherMerge *gm = (GatherMerge *) node->ps.plan;
> +
> + /*
> + * Sometimes we
> might have to run without parallelism; but if parallel
> + * mode is active then we can try to
> fire up some workers.
> + */
> + if (gm->num_workers > 0 && IsInParallelMode())
> +
> {
> + ParallelContext *pcxt;
> + bool got_any_worker =
> false;
> +
> + /* Initialize the workers required to execute Gather node. */
> +
> if (!node->pei)
> + node->pei = ExecInitParallelPlan(node-
> >ps.lefttree,
> +
> estate,
> +
> gm->num_workers);
> ..
> }
>
> There is lot of common code between ExecGatherMerge and ExecGather.
> Do you think it makes sense to have a common function to avoid the
> duplicity?
>
> I see there are small discrepancies in both the codes like I don't see
> the use of single_copy flag, as it is present in gather node.
>
>
Yes, even I thought to centrilize some things of ExecGather and
ExecGatherMerge,
but its really not something that is fixed. And I thought it might change
particularly
for the Gather Merge. And as explained by Robert single_copy doesn't make
sense
for the Gather Merge. I will still look into this to see if something can
be make
centralize.

> 3.
> +gather_merge_readnext(GatherMergeState * gm_state, int reader, bool
> force)
> {
> ..
> + tup = gm_readnext_tuple(gm_state, reader, force, NULL);
> +
> + /*
> +
> * try to read more tuple into nowait mode and store it into the tuple
> + * array.
> +
> */
> + if (HeapTupleIsValid(tup))
> + fill_tuple_array(gm_state, reader);
>
> How the above read tuple is stored in array? In anycase the above
> interface seems slightly awkward to me. Basically, I think what you
> are trying to do here is after reading first tuple in waitmode, you
> are trying to fill the array by reading more tuples. So, can't we
> push reading of this fist tuple into that function and name it as
> form_tuple_array().
>
>
Yes, you are right. First its trying to read tuple into wait-mode, and once
it
find tuple then its try to fill the tuple array (which basically try to
read tuple
into nowait-mode). Reason I keep it separate is because in case of
initializing
the gather merge, if we unable to read tuple from all the worker - while
trying
re-read, we again try to fill the tuple array for the worker who already
produced
atleast a single tuple (see gather_merge_init() for more details). Also I
thought
filling tuple array (which basically read tuple into nowait mode) and
reading tuple
(into wait-mode) are two separate task - and if its into separate function
that code
look more clear. If you have any suggestion for the function name
(fill_tuple_array)
then I am open to change that.

> 4.
> +create_gather_merge_path(..)
> {
> ..
> + 0 /* FIXME: pathnode->limit_tuples*/);
>
> What exactly you want to fix in above code?
>
>
Fixed.

> 5.
> +/* Tuple array size */
> +#define MAX_TUPLE_STORE 10
>
> Have you tried with other values of MAX_TUPLE_STORE? If yes, then
> what are the results? I think it is better to add a comment why array
> size is best for performance.
>
>

Actually I was thinking on that, but I don't wanted to add their because
its just
performance number on my machine. Anyway I added the comments.

>
> 6.
> +/* INTERFACE ROUTINES
> + * ExecInitGatherMerge - initialize the MergeAppend
> node
> + * ExecGatherMerge - retrieve the next tuple from the node
> + *
> ExecEndGatherMerge - shut down the MergeAppend node
> + *
> ExecReScanGatherMerge - rescan the MergeAppend node
>
> typo. /MergeAppend/GatherMerge
>
>
Fixed.

> 7.
> +static TupleTableSlot *gather_merge_getnext(GatherMergeState * gm_state);
> +static HeapTuple
> gm_readnext_tuple(GatherMergeState * gm_state, int nreader, bool
> force, bool *done);
>
> Formatting near GatherMergeState doesn't seem to be appropriate. I
> think you need to add GatherMergeState in typedefs.list and then run
> pgindent again.
>
>
Fixed.

> 8.
> + /*
> + * Initialize funnel slot to same tuple descriptor as outer plan.
> + */
> + if
> (!ExecContextForcesOids(&gm_state->ps, &hasoid))
>
> I think in above comment, you mean Initialize GatherMerge slot.
>
>
No, it has to be funnel slot only - its just place holder. For Gather
Merge, initialize
one slot per worker and it is done into gather_merge_init(). I will look
into this point
to see if I can get rid of funnel slot completely.

9.
> + /* Does tuple array have any avaiable tuples? */
> /avaiable/available
>
>
Fixed.

> >
> > Open Issue:
> >
> > - Commit af33039317ddc4a0e38a02e2255c2bf453115fd2 fixed the leak into
> > tqueue.c by
> > calling gather_readnext() into per-tuple context. Now for gather merge
> that
> > is
> > not possible, as we storing the tuple into Tuple array and we want tuple
> to
> > be
> > free only its get pass through the merge sort algorithm. One idea is, we
> can
> > also call gm_readnext_tuple() under per-tuple context (which will fix the
> > leak
> > into tqueue.c) and then store the copy of tuple into tuple array.
> >
>
> Won't extra copy per tuple impact performance? Is the fix in
> mentioned commit was for record or composite types, if so, does
> GatherMerge support such types and if it support, does it provide any
> benefit over Gather?
>
>
I don't think was specificially for the record or composite types - but I
might be
wrong. As per my understanding commit fix leak into tqueue.c. Fix was to add
standard to call TupleQueueReaderNext() with shorter memory context - so
that
tqueue.c doesn't leak the memory.

I have idea to fix this by calling the TupleQueueReaderNext() with
per-tuple context,
and then copy the tuple and store it to the tuple array and later with the
next run of
ExecStoreTuple() will free the earlier tuple. I will work on that.

> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>

--
Rushabh Lathia

Attachment Content-Type Size
gather_merge_v2.patch application/x-download 49.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-10-18 12:02:12 Re: Parallel Index Scans
Previous Message Rahila Syed 2016-10-18 10:38:21 Re: Parallel Index Scans