Re: parallelize queries containing subplans

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelize queries containing subplans
Date: 2017-02-02 03:53:57
Message-ID: CAA4eK1KUWmZJQKTp+AHiBbmBYJ8osp2w4txtYjDQB=K8F51RVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 1, 2017 at 10:04 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jan 31, 2017 at 6:01 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Moved this patch to next CF.
>
> So here is what seems to be the key hunk from this patch:
>
> /*
> - * Since we don't have the ability to push subplans down to workers at
> - * present, we treat subplan references as parallel-restricted. We need
> - * not worry about examining their contents; if they are unsafe, we would
> - * have found that out while examining the whole tree before reduction of
> - * sublinks to subplans. (Really we should not see SubLink during a
> - * max_interesting == restricted scan, but if we do, return true.)
> + * We can push the subplans only if they don't contain any parallel-aware
> + * node as we don't support multi-level parallelism (parallel workers
> + * invoking another set of parallel workers).
> */
> - else if (IsA(node, SubLink) ||
> - IsA(node, SubPlan) ||
> - IsA(node, AlternativeSubPlan))
> + else if (IsA(node, SubPlan))
> + return !((SubPlan *) node)->parallel_safe;
> + else if (IsA(node, AlternativeSubPlan))
> {
> - if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
> - return true;
> + AlternativeSubPlan *asplan = (AlternativeSubPlan *) node;
> + ListCell *lc;
> +
> + foreach(lc, asplan->subplans)
> + {
> + SubPlan *splan = (SubPlan *) lfirst(lc);
> +
> + Assert(IsA(splan, SubPlan));
> +
> + if (max_parallel_hazard_walker((Node *) splan, context))
> + return true;
> + }
> +
> + return false;
> }
>
> I don't see the reason for having this new code that makes
> AlternativeSubPlan walk over the subplans; expression_tree_walker
> already does that.
>

It is done this way mainly to avoid recursion/nested calls, but I
think you prefer to handle it via expression_tree_walker so that code
looks clean. Another probable reason could be that there is always a
chance that we miss handling of some expression of a particular node
(in this case AlternativeSubPlan), but if that is the case then there
are other places in the code which do operate on individual subplan/'s
in AlternativeSubPlan list.

> On the flip side I don't see the reason for
> removing the max_parallel_hazard_test() call for AlternativeSubPlan or
> for SubLink.

If we don't remove the current test of max_parallel_hazard_test() for
AlternativeSubPlan, then AlternativeSubPlan node will be considered
parallel restricted, so why do we want that after this patch. For
Sublinks, I think they would have converted to Subplans by the time we
reach this function for the parallel restricted check. Can you
elaborate what you have in mind for the handling of AlternativeSubPlan
and SubLink?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rafia Sabih 2017-02-02 03:57:30 Re: WIP: [[Parallel] Shared] Hash
Previous Message David Rowley 2017-02-02 03:50:04 Re: pg_authid.rolpassword format (was Re: Password identifiers, protocol aging and SCRAM protocol)