Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "David Rowley *EXTERN*" <dgrowleyml(at)gmail(dot)com>
Subject: Re: generic plans and "initial" pruning
Date: 2022-03-28 07:17:00
Message-ID: CA+HiwqFq9UQiSJbGpJkmd0MkxqQLZWGk1+__tyk0QM+hmYHrrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 22, 2022 at 9:44 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Tue, Mar 15, 2022 at 3:19 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Tue, Mar 15, 2022 at 5:06 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > On Mon, Mar 14, 2022 at 3:38 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > > Also, while I've not spent much time at all reading this patch,
> > > > it seems rather desperately undercommented, and a lot of the
> > > > new names are unintelligible. In particular, I suspect that the
> > > > patch is significantly redesigning when/where run-time pruning
> > > > happens (unless it's just letting that be run twice); but I don't
> > > > see any documentation or name changes suggesting where that
> > > > responsibility is now.
> > >
> > > I am sympathetic to that concern. I spent a while staring at a
> > > baffling comment in 0001 only to discover it had just been moved from
> > > elsewhere. I really don't feel that things in this are as clear as
> > > they could be -- although I hasten to add that I respect the people
> > > who have done work in this area previously and am grateful for what
> > > they did. It's been a huge benefit to the project in spite of the
> > > bumps in the road. Moreover, this isn't the only code in PostgreSQL
> > > that needs improvement, or the worst. That said, I do think there are
> > > problems. I don't yet have a position on whether this patch is making
> > > that better or worse.
> >
> > Okay, I'd like to post a new version with the comments edited to make
> > them a bit more intelligible. I understand that the comments around
> > the new invocation mode(s) of runtime pruning are not as clear as they
> > should be, especially as the changes that this patch wants to make to
> > how things work are not very localized.
>
> Actually, another area where the comments may not be as clear as they
> should have been is the changes that the patch makes to the
> AcquireExecutorLocks() logic that decides which relations are locked
> to safeguard the plan tree for execution, which are those given by
> RTE_RELATION entries in the range table.
>
> Without the patch, they are found by actually scanning the range table.
>
> With the patch, it's the same set of RTEs if the plan doesn't contain
> any pruning nodes, though instead of the range table, what is scanned
> is a bitmapset of their RT indexes that is made available by the
> planner in the form of PlannedStmt.lockrels. When the plan does
> contain a pruning node (PlannedStmt.containsInitialPruning), the
> bitmapset is constructed by calling ExecutorGetLockRels() on the plan
> tree, which walks it to add RT indexes of relations mentioned in the
> Scan nodes, while skipping any nodes that are pruned after performing
> initial pruning steps that may be present in their containing parent
> node's PartitionPruneInfo. Also, the RT indexes of partitioned tables
> that are present in the PartitionPruneInfo itself are also added to
> the set.
>
> While expanding comments added by the patch to make this clear, I
> realized that there are two problems, one of them quite glaring:
>
> * Planner's constructing this bitmapset and its copying along with the
> PlannedStmt is pure overhead in the cases that this patch has nothing
> to do with, which is the kind of thing that Andres cautioned against
> upthread.
>
> * Not all partitioned tables that would have been locked without the
> patch to come up with a Append/MergeAppend plan may be returned by
> ExecutorGetLockRels(). For example, if none of the query's
> runtime-prunable quals were found to match the partition key of an
> intermediate partitioned table and thus that partitioned table not
> included in the PartitionPruneInfo. Or if an Append/MergeAppend
> covering a partition tree doesn't contain any PartitionPruneInfo to
> begin with, in which case, only the leaf partitions and none of
> partitioned parents would be accounted for by the
> ExecutorGetLockRels() logic.
>
> The 1st one seems easy to fix by not inventing PlannedStmt.lockrels
> and just doing what's being done now: scan the range table if
> (!PlannedStmt.containsInitialPruning).

The attached updated patch does it like this.

> The only way perhaps to fix the second one is to reconsider the
> decision we made in the following commit:
>
> commit 52ed730d511b7b1147f2851a7295ef1fb5273776
> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Date: Sun Oct 7 14:33:17 2018 -0400
>
> Remove some unnecessary fields from Plan trees.
>
> In the wake of commit f2343653f, we no longer need some fields that
> were used before to control executor lock acquisitions:
>
> * PlannedStmt.nonleafResultRelations can go away entirely.
>
> * partitioned_rels can go away from Append, MergeAppend, and ModifyTable.
> However, ModifyTable still needs to know the RT index of the partition
> root table if any, which was formerly kept in the first entry of that
> list. Add a new field "rootRelation" to remember that. rootRelation is
> partly redundant with nominalRelation, in that if it's set it will have
> the same value as nominalRelation. However, the latter field has a
> different purpose so it seems best to keep them distinct.
>
> That is, add back the partitioned_rels field, at least to Append and
> MergeAppend, to store the RT indexes of partitioned tables whose
> children's paths are present in Append/MergeAppend.subpaths.

And implemented this in the attached 0002 that reintroduces
partitioned_rels in Append/MergeAppend nodes as a bitmapset of RT
indexes. The set contains the RT indexes of partitioned ancestors
whose expansion produced the leaf partitions that a given
Append/MergeAppend node scans. This project needs this way of
knowing the partitioned tables involved in producing an
Append/MergeAppend node, because we'd like to give plancache.c the
ability to glean the set of relations to be locked by scanning a plan
tree to make the tree ready for execution rather than by scanning the
range table and the only relations we're missing in the tree right now
are partitioned tables.

One fly-in-the-ointment situation I faced when doing that is the fact
that setrefs.c in most situations removes the Append/MergeAppend from
the final plan if it contains only one child subplan. I got around it
by inventing a PlannerGlobal/PlannedStmt.elidedAppendPartedRels set
which is a union of partitioned_rels of all the Append/MergeAppend
nodes in the plan tree that were removed as described.

Other than the changes mentioned above, the updated patch now contains
a bit more commentary than earlier versions, mostly around
AcquireExecutorLocks()'s new way of determining the set of relations
to lock and the significantly redesigned working of the "initial"
execution pruning.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v6-0003-Add-a-plan_tree_walker.patch application/x-patch 3.9 KB
v6-0002-Add-Merge-Append.partitioned_rels.patch application/x-patch 17.4 KB
v6-0004-Optimize-AcquireExecutorLocks-to-skip-pruned-part.patch application/x-patch 94.2 KB
v6-0001-Some-refactoring-of-runtime-pruning-code.patch application/x-patch 26.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2022-03-28 07:20:07 Re: Allow file inclusion in pg_hba and pg_ident files
Previous Message Dongming Liu 2022-03-28 07:13:46 Re: DSA failed to allocate memory