From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Consider parallel for lateral subqueries with limit |
Date: | 2020-12-01 13:43:38 |
Message-ID: | CAAaqYe_ssUwJmYkdxO0oKqsrxPB0Ktndu7i5YiThjCor7+mqOg@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Nov 30, 2020 at 7:00 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> I've been investigating parallelizing certain correlated subqueries,
> and during that work stumbled across the fact that
> set_rel_consider_parallel disallows parallel query on what seems like
> a fairly simple case.
>
> Consider this query:
>
> select t.unique1
> from tenk1 t
> join lateral (select t.unique1 from tenk1 offset 0) l on true;
>
> Current set_rel_consider_parallel sets consider_parallel=false on the
> subquery rel because it has a limit/offset. That restriction makes a
> lot of sense when we have a subquery whose results conceptually need
> to be "shared" (or at least be the same) across multiple workers
> (indeed the relevant comment in that function notes that cases where
> we could prove a unique ordering would also qualify, but punts on
> implementing that due to complexity). But if the subquery is LATERAL,
> then no such conceptual restriction applies.
>
> If we change the code slightly to allow considering parallel query
> even in the face of LIMIT/OFFSET for LATERAL subqueries, then our
> query above changes from the following plan:
>
> Nested Loop
>   Output: t.unique1
>   ->  Gather
>         Output: t.unique1
>         Workers Planned: 2
>         ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 t
>               Output: t.unique1
>   ->  Gather
>         Output: NULL::integer
>         Workers Planned: 2
>         ->  Parallel Index Only Scan using tenk1_hundred on public.tenk1
>               Output: NULL::integer
>
> to this plan:
>
> Gather
>   Output: t.unique1
>   Workers Planned: 2
>   ->  Nested Loop
>         Output: t.unique1
>         ->  Parallel Index Only Scan using tenk1_unique1 on public.tenk1 t
>               Output: t.unique1
>         ->  Index Only Scan using tenk1_hundred on public.tenk1
>               Output: NULL::integer
>
> The code change itself is quite simple (1 line). As far as I can tell
> we don't need to expressly check parallel safety of the limit/offset
> expressions; that appears to happen elsewhere (and that makes sense
> since the RTE_RELATION case doesn't check those clauses either).
>
> If I'm missing something about the safety of this (or any other
> issue), I'd appreciate the feedback.
Note that near the end of grouping_planner we have a similar check:

    if (final_rel->consider_parallel && root->query_level > 1 &&
        !limit_needed(parse))

which guards copying the partial paths from the current rel to the final
rel. I haven't managed to come up with a test case that exercises that
check, though: simple examples like the one above get converted into a
JOIN, so we never reach grouping_planner for a subquery. Making the
subquery above correlated does get us to that point, but then the
subquery isn't marked parallel safe for other reasons (it has params),
so that's not a useful test. I'm not sure whether there are cases that
can't be converted to a join but also don't involve params; I haven't
thought about it much yet.
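
To make the interaction between the two checks easier to reason about, here
is a rough standalone model of them. This is only a sketch, not planner
code: ToySubquery, toy_subquery_consider_parallel and toy_copy_partial_paths
are made-up names, the boolean fields merely stand in for the real planner
state, and the allow_lateral flag is my assumption about how the relaxed
rule could be expressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the planner state the two checks look at. */
typedef struct ToySubquery
{
	bool		limit_needed;		/* stands in for limit_needed(parse) */
	bool		is_lateral;			/* stands in for the RTE being LATERAL */
	int			query_level;		/* stands in for root->query_level */
	bool		consider_parallel;	/* stands in for final_rel->consider_parallel */
} ToySubquery;

/*
 * Toy model of the subquery restriction described for
 * set_rel_consider_parallel.  With allow_lateral = false it models the
 * current rule (any LIMIT/OFFSET disables parallel consideration); with
 * allow_lateral = true it models the assumed relaxation, where a LATERAL
 * subquery is exempt from that restriction.
 */
bool
toy_subquery_consider_parallel(const ToySubquery *sq, bool allow_lateral)
{
	assert(sq != NULL);

	if (sq->limit_needed && !(allow_lateral && sq->is_lateral))
		return false;
	return true;
}

/*
 * Toy model of the guard near the end of grouping_planner: partial paths
 * are copied to the final rel only for a parallel-considered subquery
 * level that does not need a limit.
 */
bool
toy_copy_partial_paths(const ToySubquery *sq)
{
	assert(sq != NULL);

	return sq->consider_parallel &&
		sq->query_level > 1 &&
		!sq->limit_needed;
}
```

In this model a LATERAL subquery with a limit passes the relaxed first check
but is still rejected by the second one, which is why the grouping_planner
guard seems worth a look if we ever reach it for such a subquery.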
James