Re: Allowing parallel-safe initplans

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Allowing parallel-safe initplans
Date: 2023-07-13 21:44:19
Message-ID: 3164093.1689284659@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Eventually I noticed that all the failing cases were instances of
> optimizing MIN()/MAX() aggregates into indexscans, and then I figured
> out what the problem is: we substitute Params for the optimized-away
> Aggref nodes in setrefs.c, *after* SS_finalize_plan has been run.
> That means we fail to account for those Params in extParam/allParam
> sets. We've gotten away with that up to now because such Params
> could only appear where Aggrefs can appear, which is only in top-level
> (above scans and joins) nodes, which generally don't have any of the
> sorts of rescan optimizations that extParam/allParam bits control.
> But this patch results in needing to have a correct extParam set for
> the node just below Gather, and we don't. I am not sure whether there
> are any reachable bugs without this patch; but there might be, or some
> future optimization might introduce one.

> It seems like the cleanest fix for this is to replace such optimized
> Aggrefs in a separate tree scan before running SS_finalize_plan.
> That's fairly annoying from a planner-runtime standpoint, although
> we could skip the extra pass in the typical case where no minmax aggs
> have been optimized.
> I also thought about swapping the order of operations so that we
> run SS_finalize_plan after setrefs.c. That falls down because of
> set_param_references itself, which requires those bits to be
> calculated already. But maybe we could integrate that computation
> into SS_finalize_plan instead? There's certainly nothing very
> pretty about the way it's done now.

I tried both of those and concluded they'd be too messy for a patch
that we might find ourselves having to back-patch. So 0001 attached
fixes it by teaching SS_finalize_plan to treat optimized MIN()/MAX()
aggregates as if they were already Params. It's slightly annoying
to have knowledge of that optimization metastasizing into another
place, but the alternatives are even less palatable.

Having done that, if you adjust 0002 to inject Gathers even when
debug_parallel_query = regress, the only diffs in the core regression
tests are that some initPlans disappear from EXPLAIN output. The
outputs of the actual queries are still correct, demonstrating that
e89a71fb4 does indeed make it work as long as the param bitmapsets
are correct.

I'm still resistant to the idea of kluging EXPLAIN to the extent
of hiding the EXPLAIN output changes. It wouldn't be that hard
to do really, but I worry that such a kluge might hide real problems
in future. So what I did in 0002 was to allow initPlans for an
injected Gather only if debug_parallel_query = on, so that there
will be a place for EXPLAIN to show them. Other than the changes
in that area, 0002 is the same as the previous patch.

regards, tom lane

Attachment Content-Type Size
v2-0001-Account-for-optimized-MinMax-aggregates-during-SS.patch text/x-diff 7.9 KB
v2-0002-Allow-plan-nodes-with-initPlans-to-be-considered-.patch text/x-diff 19.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-07-13 22:12:53 Re: In Postgres 16 BETA, should the ParseNamespaceItem have the same index as it's RangeTableEntry?
Previous Message Pavel Luzanov 2023-07-13 21:23:28 Re: psql: Add role's membership options to the \du+ command