| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> | 
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> | 
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Parallel safety for extern params | 
| Date: | 2017-10-16 10:59:42 | 
| Message-ID: | CAA4eK1KbJwnh6Rhv=zd6gcuA7nNtXC0s-KBfJfExEJX6mpJwCA@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Sat, Oct 14, 2017 at 1:51 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Oct 13, 2017 at 1:19 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> After fixing this problem, when I ran the regression tests with
>> force_parallel_mode = regress, I saw multiple other failures.  All the
>> failures boil down to two kinds of cases:
>>
>> 1. There was an assumption while extracting simple expressions that
>> the target list of gather node can contain constants or Var's.  Now,
>> once the above patch allows extern params as parallel-safe, that
>> assumption no longer holds true.  We need to handle params as well.
>> Attached patch fix_simple_expr_interaction_gather_v1.patch handles
>> that case.
>
> -     * referencing the child node's output ... but setrefs.c might also have
> -     * copied a Const as-is.
> +     * referencing the child node's output or a Param... but setrefs.c might
> +     * also have copied a Const as-is.
>
> I think the Param case should be mentioned after "... but" not before
> - i.e. referencing the child node's output... but setrefs.c might also
> have copied a Const or Param is-is.
>
I am not sure if we can write the comment like that (.. copied a Const
or Param as-is.) because fix_upper_expr_mutator in setrefs.c has a
special handling for Var and Param where constants are copied as-is
via expression_tree_mutator.  Also as proposed, the handling for
params is more like Var in exec_save_simple_expr.
>> 2. We don't allow execution to use parallelism if the plan can be
>> executed multiple times.  This has been enforced in ExecutePlan, but
>> it seems like that we miss to handle the case where we are already in
>> parallel mode by the time we enforce that condition.  So, what
>> happens, as a result, is that it will allow to use parallelism when it
>> shouldn't (when the same plan is executed multiple times) and lead to
>> a crash.  One way to fix is that we temporarily exit the parallel mode
>> in such cases and reenter in the same state once the current execution
>> is over. Attached patch fix_parallel_mode_nested_execution_v1.patch
>> fixes this problem.
>
> This seems completely unsafe.  If somebody's already entered parallel
> mode, they are counting on having the parallel-mode restrictions
> enforced until they exit parallel mode.  We can't just disable those
> restrictions for a while in the middle and then put them back.
>
Right.
> I think the bug is in ExecGather(Merge): it assumes that if we're in
> parallel mode, it's OK to start workers.  But actually, it shouldn't
> do this unless ExecutePlan ended up with use_parallel_mode == true,
> which isn't quite the same thing.  I think we might need ExecutePlan
> to set a flag in the estate that ExecGather(Merge) can check.
>
Attached patch fixes the problem in a suggested way.
-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
| Attachment | Content-Type | Size | 
|---|---|---|
| fix_parallel_mode_nested_execution_v2.patch | application/octet-stream | 2.5 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Kapila | 2017-10-16 11:13:52 | Re: PATCH: enabling parallel execution for cursors explicitly (experimental) | 
| Previous Message | Magnus Hagander | 2017-10-16 10:55:53 | Re: Determine state of cluster (HA) |