Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "David Rowley *EXTERN*" <dgrowleyml(at)gmail(dot)com>
Subject: Re: generic plans and "initial" pruning
Date: 2022-04-04 12:55:54
Message-ID: CA+HiwqH9R8tPwbxOd_uxTjjtHpd-yTU4eQW7ndDVWrbah_ACEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review.

On Sun, Apr 3, 2022 at 8:33 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> I noticed a definitional problem in 0001 that's also a bug in some
> conditions -- namely that the bitmapset "validplans" is never explicitly
> initialized to NIL. In the original coding, the BMS was always returned
> from somewhere; in the new code, it is passed from an uninitialized
> stack variable into the new ExecInitPartitionPruning function, which
> then proceeds to add new members to it without initializing it first.

Hmm, the following blocks in ExecInitPartitionPruning() define
*initially_valid_subplans:

/*
* Perform an initial partition prune pass, if required.
*/
if (prunestate->do_initial_prune)
{
/* Determine which subplans survive initial pruning */
*initially_valid_subplans = ExecFindInitialMatchingSubPlans(prunestate);
}
else
{
/* We'll need to initialize all subplans */
Assert(n_total_subplans > 0);
*initially_valid_subplans = bms_add_range(NULL, 0,
n_total_subplans - 1);
}

AFAICS, both assign *initially_valid_subplans a value whose
computation is not dependent on reading it first, so I don't see a
problem.

Am I missing something?

> Indeed that function's header comment explicitly indicates that it is
> not initialized:
>
> + * Initial pruning can be done immediately, so it is done here if needed and
> + * the set of surviving partition subplans' indexes are added to the output
> + * parameter *initially_valid_subplans.
>
> even though this is not fully correct, because when prunestate->do_initial_prune
> is false, then the BMS *is* initialized.
>
> I have no opinion on where to initialize it, but it needs to be done
> somewhere and the comment needs to agree.

I can see that the comment is insufficient, so I've expanded it as follows:

- * Initial pruning can be done immediately, so it is done here if needed and
- * the set of surviving partition subplans' indexes are added to the output
- * parameter *initially_valid_subplans.
+ * On return, *initially_valid_subplans is assigned the set of indexes of
+ * child subplans that must be initialized along with the parent plan node.
+ * Initial pruning is performed here if needed and in that case only the
+ * surviving subplans' indexes are added.

> I think the names ExecCreatePartitionPruneState and
> ExecInitPartitionPruning are too confusingly similar. Maybe the former
> should be renamed to somehow make it clear that it is a subroutine for
> the former.

Ah, yes. I've taken out the "Exec" from the former.

> At the top of the file, there's a new comment that reads:
>
> * ExecInitPartitionPruning:
> * Creates the PartitionPruneState required by each of the two pruning
> * functions.
>
> What are "the two pruning functions"? I think here you mean "Append"
> and "MergeAppend". Maybe spell that out explicitly.

Actually it meant: ExecFindInitiaMatchingSubPlans() and
ExecFindMatchingSubPlans(). They perform "initial" and "exec" set of
pruning steps, respectively.

I realized that both functions have identical bodies at this point,
except that they pass 'true' and 'false', respectively, for
initial_prune argument of the sub-routine
find_matching_subplans_recurse(), which is where the pruning using the
appropriate set of steps contained in PartitionPruneState
(initial_pruning_steps or exec_pruning_steps) actually occurs. So,
I've updated the patch to just retain the latter, adding an
initial_prune parameter to it to pass to the aforementioned
find_matching_subplans_recurse().

I've also updated the run-time pruning module comment to describe this change:

* ExecFindMatchingSubPlans:
- * Returns indexes of matching subplans after evaluating all available
- * expressions, that is, using execution pruning steps. This function can
- * can only be called during execution and must be called again each time
- * the value of a Param listed in PartitionPruneState's 'execparamids'
- * changes.
+ * Returns indexes of matching subplans after evaluating the expressions
+ * that are safe to evaluate at a given point. This function is first
+ * called during ExecInitPartitionPruning() to find the initially
+ * matching subplans based on performing the initial pruning steps and
+ * then must be called again each time the value of a Param listed in
+ * PartitionPruneState's 'execparamids' changes.

> I think this comment needs to be reworded:
>
> + * Subplans would previously be indexed 0..(n_total_subplans - 1) should be
> + * changed to index range 0..num(initially_valid_subplans).

Assuming you meant to ask to write this without the odd notation, I've
expanded the comment as follows:

- * Subplans would previously be indexed 0..(n_total_subplans - 1) should be
- * changed to index range 0..num(initially_valid_subplans).
+ * Current values of the indexes present in PartitionPruneState count all the
+ * subplans that would be present before initial pruning was done. If initial
+ * pruning got rid of some of the subplans, any subsequent pruning passes will
+ * will be looking at a different set of target subplans to choose from than
+ * those in the pre-initial-pruning set, so the maps in PartitionPruneState
+ * containing those indexes must be updated to reflect the new indexes of
+ * subplans in the post-initial-pruning set.

I've attached only the updated 0001, though I'm still working on the
others to address David's comments.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v9-0001-Some-refactoring-of-runtime-pruning-code.patch application/octet-stream 31.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-04-04 12:59:28 Re: CLUSTER on partitioned index
Previous Message Daniel Gustafsson 2022-04-04 12:29:40 Re: Reword docs of feature "Remove temporary files after backend crash"