Re: generic plans and "initial" pruning

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: generic plans and "initial" pruning
Date: 2022-04-01 01:31:54
Message-ID: CAApHDvp_DjVVkgSV24+UF7p_yKWeepgoo+W2SWLLhNmjwHTVYQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, 31 Mar 2022 at 16:25, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Rebased.

I've been looking over the v8 patch and I'd like to propose some
semi-baked ideas to improve things. I'd need to go and write them
myself to know for sure whether they'd actually work out ok.

1. You've changed the signature of various functions by adding
ExecLockRelsInfo *execlockrelsinfo. I'm wondering why you didn't just
put the ExecLockRelsInfo as a new field in PlannedStmt?

I think the above gets around messing with the signatures of
CreateQueryDesc(), ExplainOnePlan(), pg_plan_queries(),
PortalDefineQuery() and ProcessQuery(). It would get rid of your
change of foreach to forboth in execute_sql_string() / PortalRunMulti()
and gets rid of a number of places where you're carrying around a
variable named execlockrelsinfo_list. It would also make the patch
significantly easier to review as you'd be touching far fewer files.

2. I don't really like the way you've gone about most of the patch...

The way I imagine this working is that during create_plan() we visit
all nodes that have run-time pruning, then inside create_append_plan()
and create_merge_append_plan() we'd tag those onto a new field in
PlannerGlobal. That way you can store the PartitionPruneInfos in the
new PlannedStmt field in standard_planner() after the
makeNode(PlannedStmt).

Instead of storing the PartitionPruneInfo in the Append / MergeAppend
struct, you'd just add a new index field to those structs. The index
would start at 0 for the 0th PartitionPruneInfo. You'd know the index
in advance: it's simply
list_length(root->glob->partitionpruneinfos) at the point you append
the new entry.

You'd then assign the root->glob->partitionpruneinfos to
PlannedStmt.partitionpruneinfos and anytime you needed to do run-time
pruning during execution, you'd need to use the Append / MergeAppend's
partition_prune_info_idx to lookup the PartitionPruneInfo in some new
field you add to EState to store those. You'd leave that index as -1
if there's no PartitionPruneInfo for the Append / MergeAppend node.

When you do AcquireExecutorLocks(), you'd iterate over the
PlannedStmt's PartitionPruneInfos to figure out which subplans to
prune. You'd then have an array sized
list_length(plannedstmt->partitionpruneinfos) where you'd store the
results. When the Append / MergeAppend node starts up, you'd just
check if part_prune_info_idx >= 0 and, if there's a non-NULL result
stored, use that result. That's how you'd ensure you always get the
same run-time prune result between locking and plan startup.

3. Also, looking at ExecGetLockRels(), shouldn't it be the planner's
job to determine the minimum set of relations which must be locked? I
think the plan tree traversal during execution is not great. The
whole point of this patch is to reduce overhead during execution, so a
full additional plan traversal on top of the 3 we already do for
start/run/end of execution doesn't seem great.

I think this means that during AcquireExecutorLocks() you'd start with
the minimum set of RTEs that need to be locked, as determined during
create_plan() and stored in some Bitmapset field in PlannedStmt. This
minimal set would exclude only RTIs that might possibly be used due to
a PartitionPruneInfo with initial pruning steps, i.e. it would include
RTIs from any PartitionPruneInfo with no init pruning steps (you can't
skip any locks for those). All you'd need to do to determine the RTEs
to lock is take the minimal set and execute each PartitionPruneInfo in
the PlannedStmt that has init steps.

4. It's a bit disappointing to see RelOptInfo.partitioned_rels getting
revived here. Why don't you just add a partitioned_relids field to
PartitionPruneInfo and have make_partitionedrel_pruneinfo() build you
a Relids of them? PartitionedRelPruneInfo already has an rtindex
field, so you just need to bms_add_member() whatever that rtindex is.

This is a fairly high-level review at this stage. I can look in more
detail once the above points have been considered. You may find or
know of some reason why it can't be done the way I mention above.

David
