Re: parallelize queries containing initplans

From: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelize queries containing initplans
Date: 2017-08-09 13:21:30
Message-ID: CAJrrPGfQ=0jdC5W3D-i8CyeXJt4_35HETBwyRUBbN_x4QBJbtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 9, 2017 at 9:26 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Aug 9, 2017 at 10:24 AM, Haribabu Kommi
> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
> >
> >
> > For the following query the parallel plan is not chosen. The query
> contains
> > an init plan that refer the outer node.
> >
>
> We don't want to generate the parallel plan for such cases. Whenever
> initplan refers to any outer node (aka correlated plan), it won't
> generate a parallel plan. Also, for t2, it doesn't choose a parallel
> plan because one-time filter refers to the outer node (again
> correlated plan case). Basically, till now we don't support parallel
> plan for any case where the correlated plan is used. So, it is
> perfectly valid that it doesn't use parallel plan here.

Thanks for providing the details.

> Here if you notice the parallel node t2 refers to the initplan which
> can be parallelised after this patch. Basically, whenever the
> initplan is attached at or above Gather node, we compute its value and
> pass down to workers.
>

Thanks for the details. I checked the code also.

By the way, I tested the patch with by DML support for parallel patch to
check the returning of clause of insert, and all the returning clause init
plans
are parallel plans with this patch.

> By the way, the patch doesn't apply on HEAD, so attached rebased patch.

Thanks for the updated patch. I have some comments and I am yet to finish
the review.

+ /*
+ * We don't want to generate gather or gather merge node if there are
+ * initplans at some query level below the current query level as those
+ * plans could be parallel-unsafe or undirect correlated plans. Ensure to
+ * mark all the partial and non-partial paths for a relation at this level
+ * to be parallel-unsafe.
+ */
+ if (is_initplan_below_current_query_level(root))
+ {
+ foreach(lc, rel->partial_pathlist)
+ {
+ Path *subpath = (Path *) lfirst(lc);
+
+ subpath->parallel_safe = false;
+ }
+
+ foreach(lc, rel->pathlist)
+ {
+ Path *subpath = (Path *) lfirst(lc);
+
+ subpath->parallel_safe = false;
+ }
+ return;
+ }
+

The above part of the code is almost same in mutiple places, is it possible
to change to function?

+ node->initParam = NULL;

This new parameter is set to NULL in make_gather function, the same
parameter
is added to GatherMerge structure also, but anyway this parameter is set to
NULL
makeNode macro, why only setting it to here, but not the other place.

Do we need to set it to default value such as NULL or false if it is
already the same value?
This is not related to the above parameter, for all existing parameters
also.

+ if (IsA(plan, Gather))
+ ((Gather *) plan)->initParam = bms_intersect(plan->lefttree->extParam,
initSetParam);
+ else if (IsA(plan, GatherMerge))
+ ((GatherMerge *) plan)->initParam =
bms_intersect(plan->lefttree->extParam, initSetParam);
+ else
+ elog(ERROR, "unrecognized node type: %d", nodeTag(plan));

The else case is not possible, because it is already validated for Gather
or GatherMerge.
Can we change it simple if and else?

Regards,
Hari Babu
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2017-08-09 13:23:37 Re: Walsender timeouts and large transactions
Previous Message Jeremy Finzel 2017-08-09 13:14:22 Review of DDL replication solution