Re: Should we add GUCs to allow partition pruning to be disabled?

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Should we add GUCs to allow partition pruning to be disabled?
Date: 2018-05-07 06:00:59
Message-ID: CAKJS1f-BhJxsj_2Yb2Mj0jB05pDR0YxXKC-rsY7w3jxk3c39Bg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Many thanks for reviewing this.

On 2 May 2018 at 20:07, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> + <para>
> + Partition Pruning is also more powerful than constraint exclusion as
> + partition pruning is not something that is performed only during the
> + planning of a given query.
>
> Maybe, don't repeat "partition pruning" again in the same sentence. How
> about:
>
> .. more powerful than constraint exclusion as *it* is not something..

changed.

> Or may suggest to rewrite it as:
>
> Partition pruning is also more powerful than constraint exclusion as it
> can be performed not only during the planning of a given query, but also
> during its execution.
>
> If you accept the above rewrite, the next sentences in the paragraph:
>
> + In certain cases, partition pruning may also
> + be performed during execution of the query as well. This allows pruning
> + to be performed using values which are unknown during query planning, for
> + example, using parameters defined in a <command>PREPARE</command>
> + statement, using a value obtained from a subquery or using parameters
> from
> + a parameterized nested loop join.
>
> could be adjusted a bit to read as:
>
> For example, this allows pruning to be performed using values which are
> unknown during query planning but will be known during execution, such as
> using parameters defined in a <command>PREPARE</command> statement (if a
> generic plan is chosen), or using a value obtained from a subquery, or
> using values from an outer row of a parameterized nested loop join.

I've changed this a bit but I didn't mention generic plans. What you
say is true, but I didn't think we needed to be so specific.

> + <para>
> + The partition pruning which is performed during execution is done so at
> + either one or both of the following times:
>
> done so at -> done at

Changed

> + If partition pruning can be
> + performed here then there is the added benefit of not having to
> + initialize partitions which are pruned. Partitions which are pruned
> + during this stage will not show up in the query's
> + <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>. It
> + is possible to determine the number of partitions which were removed
> + using this method by observing the <quote>Subplans Removed</quote>
> + property in the <command>EXPLAIN</command> output.
>
> While it might be OK to keep the last two sentences, not sure about the
> 1st, which seems like it's spelling out an implementation detail -- that
> there is an initialization step for partitions. It's a nice performance
> enhancement, sure, but might be irrelevant to the users reading this
> documentation.

I've reworded this. I think it's important to inform the reader that
this is performed during initialization of the plan as without that
they might ask why there are two phases of pruning and not just one.
Not having to initialize the subnode for pruned partitions is the sole
advantage of doing this pruning phase, so I would rather be specific
about when it occurs.

> + nested loop joins. Since the value of these parameters may change
> many
> + times during the execution of the query, partition pruning is
> performed
> + whenever one of the execution parameters which is being compared to a
> + partition column or expression changes.
>
> How about writing the last part as: whenever one of the execution
> parameters relevant to pruning changes

I've reworded this.

> + <note>
> + <para>
> + Currently, partition pruning of partitions during the planning of an
> + <command>UPDATE</command> or <command>DELETE</command> command are
> + internally implemented using the constraint exclusion method. Only
> + <command>SELECT</command> uses the faster partition pruning method.
> Also
> + partition pruning performed during execution is only done so for the
> + Append node type. Both of these limitations are likely to be removed
> + in a future release of <productname>PostgreSQL</productname>.
> + </para>
> + </note>
>
> Do we need to write this given that we decided to decouple even the
> UPDATE/DELETE pruning from the constraint_exclusion configuration?

I think it's important to inform people of the limitations. I know
there's a lot of opinions floating around about the usability of
partitioning in PostgreSQL with a large number of partitions. I
included this here so interested parties know that their problems are
not all solved by partition pruning. Perhaps those people can watch
for the removal of this notice.

> Also,
> noting that only Append nodes can use execution-time pruning seems
> unnecessary. I don't see plan node names mentioned like this elsewhere in
> the documentation. But more to the point, it seems like spilling out
> finer implementation details (and/or limitations thereof) in the
> user-facing documentation.

I thought about this while writing the patch, and it forced me to grep
for instances of "Append" in the docs. There were some, so I didn't
think I was breaking any rules. I also have no idea how else we might
explain that it works for Append and not MergeAppend. It's likely
going to be easier to answer possible to future bug reports which
complain run-time pruning is broken with MergeAppend with "It's not a
bug, it's behaving exactly as described in the documents. <link to
docs>".

I have now changed Append for <literal>Append</literal> in the patch
which is aligned to what perform.sgml is doing.

v2 patch is attached.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
further_enable_partition_pruning_doc_updates_v2.patch application/octet-stream 6.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2018-05-07 06:23:07 Re: Having query cache in core
Previous Message David Rowley 2018-05-07 05:05:05 Re: Should we add GUCs to allow partition pruning to be disabled?