Re: Parallelize correlated subqueries that execute within each worker

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Parallelize correlated subqueries that execute within each worker
Date: 2023-01-22 03:07:58
Message-ID: CAAaqYe_ByNJxaCYqN6uKKeyTpJWkEXBoYrCjvAQGNrVE5ZEYzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 18, 2023 at 9:34 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Wed, Jan 18, 2023 at 2:09 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >
> > Hi,
> >
> > This patch hasn't been updated since September, and it got broken by
> > 4a29eabd1d91c5484426bc5836e0a7143b064f5a which the incremental sort
> > stuff a little bit. But the breakage was rather limited, so I took a
> > stab at fixing it - attached is the result, hopefully correct.
>
> Thanks for fixing this up; the changes look correct to me.
>
> > I also added a couple minor comments about stuff I noticed while
> > rebasing and skimming the patch, I kept those in separate commits.
> > There's also a couple pre-existing TODOs.
>
> I started work on some of these, but wasn't able to finish this
> evening, so I don't have an updated series yet.
>
> > James, what's your plan with this patch. Do you intend to work on it for
> > PG16, or are there some issues I missed in the thread?
>
> I'd love to see it get into PG16. I don't have any known issues, but
> reviewing activity has been light. Originally Robert had had some
> concerns about my original approach; I think my updated approach
> resolves those issues, but it'd be good to have that sign-off.
>
> Beyond that I'm mostly looking for review and evaluation of the
> approach I've taken; of note is my description of that in [1].

Here's an updated patch version incorporating feedback from Tomas as
well as some additional comments and tweaks.

While working through Tomas's comment about a conditional in the
max_parallel_hazard_waker being guaranteed true I realized that in the
current version of the patch the safe_param_ids tracking in
is_parallel_safe isn't actually needed any longer. That seemed
suspicious, and so I started digging, and I found out that in the
current approach all of the tests pass with only the changes in
clauses.c. I don't believe that the other changes aren't needed;
rather I believe there isn't yet a test case exercising them, but I
realize that means I can't prove they're needed. I spent some time
poking at this, but at least with my current level of imagination I
haven't stumbled across a query that would exercise these checks. One
of the reasons I'm fairly confident that this is true is that the
original approach (which was significantly more invasive) definitely
required rechecking parallel safety at each level until we reached the
point where the subquery was known to be generated within the current
worker through the safe_param_ids tracking mechanism. Of course it is
possible that that complexity is actually required and this simplified
approach isn't feasible (but I don't have a good reason to suspect
that currently). It's also possible that the restrictions on
subqueries just aren't necessary...but that isn't compelling because
it would require proving that you can never have a query level with
as-yet unsatisfied lateral rels.

Note: All of the existing tests for "you can't parallelize a
correlated subquery" are all simple versions which are not actually
parallel unsafe in theory. I assume they were added to show that the
code excluded that broad case, and there wasn't any finer grain of
detail required since the code simply didn't support making the
decision with that granularity anyway. But that means there weren't
any existing test cases to exercise the granularity I'm now trying to
achieve.

If someone is willing to help out what I'd like help with currently is
finding such a test case (where a gather or gather merge path would
otherwise be created but at the current plan level not all of the
required lateral rels are yet available -- meaning that we can't
perform all of the subqueries within the current worker). In support
of that patch 0004 converts several of the new parallel safety checks
into WARNING messages instead to make it easy to see if a query
happens to encounter any of those checks.

James Coleman

Attachment Content-Type Size
v7-0004-Warning-messages-for-finding-test-cases.patch application/octet-stream 3.3 KB
v7-0001-Add-tests-before-change.patch application/octet-stream 7.2 KB
v7-0003-Possible-additional-checks.patch application/octet-stream 3.0 KB
v7-0002-Parallelize-correlated-subqueries.patch application/octet-stream 30.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2023-01-22 03:19:15 Re: run pgindent on a regular basis / scripted manner
Previous Message Andres Freund 2023-01-22 02:54:35 Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation