Re: segfault with incremental sort

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, luis(dot)roberto(at)siscobra(dot)com(dot)br, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>, "alan(dot)formagi" <alan(dot)formagi(at)siscobra(dot)com(dot)br>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: segfault with incremental sort
Date: 2020-11-25 14:06:25
Message-ID: CAA4eK1LG4=cNfciySE7ZmfFK9Cxmx_pzynTQV8pA_vC8T1N5ZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, Nov 25, 2020 at 7:57 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Mon, Nov 23, 2020 at 9:44 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 23, 2020 at 7:23 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > >
> >
> > Yeah, this doesn't require any communication between leader and worker
> > but to enable it for such cases, we need to identify when we have
> > correlated subquery where we can make it parallel-safe and then
> > probably allow it. IIRC, we only allow cases of un-correlated subplans
> > and initplans when those are at the same or a level above the Gather
> > (Merge) node.
>
> In principle do you see any reason why given:
>
> select distinct
> unique1,
> (select t.unique1 from tenk1 where tenk1.unique1 = t.unique1)
> from tenk1 t, generate_series(1, 1000);
>
> it wouldn't be valid to go from the current (with patch from [1]) plan of:
>
> Unique
> -> Sort
> Sort Key: t.unique1, ((SubPlan 1))
> -> Gather
> Workers Planned: 2
> -> Nested Loop
> -> Parallel Index Only Scan using tenk1_unique1 on tenk1 t
> -> Function Scan on generate_series
> SubPlan 1
> -> Index Only Scan using tenk1_unique1 on tenk1
> Index Cond: (unique1 = t.unique1)
>
> to this plan?
>
> Unique
> -> Gather Merge
> Workers Planned: 2
> -> Sort
> Sort Key: t.unique1, ((SubPlan 1))
> -> Nested Loop
> -> Parallel Index Only Scan using tenk1_unique1 on tenk1 t
> -> Function Scan on generate_series
> SubPlan 1
> -> Index Only Scan using tenk1_unique1 on tenk1
> Index Cond: (unique1 = t.unique1)
>
> My intuition is that it would be safe (not just in the parallel sense
> but also in the sense of safety for pushing down expression evaluation
> into the target list to push the sort down), but I want to make sure
> that's correct before proceeding.
>

I don't see any problem with respect to parallel-safety but why do we
want to generate parallel-plans for correlated sub-queries without
doing more analysis on which kind of cases it would be safe apart from
this particular case?

> But I also have a bigger question:
>
> I've been stepping through this in the debugger, and have noticed that
> the only reason we aren't selecting the second plan (again, with the
> fix from [1]) is that the plan for "Index Only Scan using
> tenk1_unique1 on tenk1" when passed into build_subplan has
> "plan->parallel_safe = false". Earlier "consider_parallel = false" has
> been set on the path by set_rel_consider_parallel because
> is_parallel_safe doesn't find any safe_param_ids, and thus the
> correlated subquery having a PARAM_EXEC param fails the safe_param_ids
> member check in max_parallel_hazard_walker since the param isn't from
> this level.
>
> So far that's correct, but then when we come back around to checking
> parallel safety of the expression for a parallel sort we call
> is_parallel_safe, and the max_parallel_hazard_walker SubPlan case
> checks subplan->parallel_safe, finds it false
> ("max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context)" returns
> true), and thus doesn't proceed to actually checking the subplan's
> testexpr or args.
>
> That seems a bit of miss to me, because while such a subplan is
> parallel unsafe in isolation, it is not in this context.
>

It is possible but we don't that the context unless subplan is marked
parallel-safe. I think if we need to generate parallel-safe plans for
correlated queries, we might want to see how the subplan will be
marked parallel-safe.

> That is, if
> we re-run the checks on testexpr and args in this context, then we'll
> not find anything parallel unsafe about them (the param can safely be
> found in the current context), so we'll be free to execute the subplan
> in workers.
>
> > Now, I think it is quite possible that in PG-13 we have
> > unintentionally allowed plans containing correlated sub-queries but
> > missed to take care of it at all required places.
>
> Yes, we're discussing a fix in [1] for PG13's missing checks for
> parallel safety here.
>

So, are you trying to make such plans (which have correlated
sub-queries) parallel-safe or parallel-unsafe?

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message James Coleman 2020-11-25 14:57:37 Re: segfault with incremental sort
Previous Message Oleksandr Shulgin 2020-11-25 09:25:29 Re: BUG #16743: psql doesn't show whole expression in stored column