Re: Parallelize correlated subqueries that execute within each worker

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Parallelize correlated subqueries that execute within each worker
Date: 2023-06-06 09:35:51
Message-ID: CAMbWs4_evjcMzN8Gw78bHfhfo2FKJThqhEjRJRmoMZx=NXcJ7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 23, 2023 at 10:00 PM James Coleman <jtc331(at)gmail(dot)com> wrote:

> Which this patch we do in fact now see (as expected) rels with
> non-empty lateral_relids showing up in generate_[useful_]gather_paths.
> And the partial paths can now have non-empty required outer rels. I'm
> not able to come up with a plan that would actually be caught by those
> checks; I theorize that because of the few places we actually call
> generate_[useful_]gather_paths we are in practice already excluding
> those, but for now I've left these as a conditional rather than an
> assertion because it seems like the kind of guard we'd want to ensure
> those methods are safe.

I'm trying to understand this part. AFAICS we will not create partial
paths for a rel, base or join, if it has lateral references. So it
seems to me that in generate_[useful_]gather_paths after we've checked
that there are partial paths, the checks for lateral_relids are not
necessary because lateral_relids should always be empty in this case.
Maybe I'm missing something.

And while trying the v9 patch I came across a crash with the query
below.

set min_parallel_table_scan_size to 0;
set parallel_setup_cost to 0;
set parallel_tuple_cost to 0;

explain (costs off)
select * from pg_description t1 where objoid in
(select objoid from pg_description t2 where t2.description =
t1.description);
QUERY PLAN
--------------------------------------------------------
Seq Scan on pg_description t1
Filter: (SubPlan 1)
SubPlan 1
-> Gather
Workers Planned: 2
-> Parallel Seq Scan on pg_description t2
Filter: (description = t1.description)
(7 rows)

select * from pg_description t1 where objoid in
(select objoid from pg_description t2 where t2.description =
t1.description);
WARNING: terminating connection because of crash of another server process

Seems something is wrong when extracting the argument from the Param in
parallel worker.

BTW another rebase is needed as it no longer applies to HEAD.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-06-06 09:56:16 Re: TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 927, PID: 568639)
Previous Message vignesh C 2023-06-06 09:12:38 Re: Support logical replication of DDLs