Re: [HACKERS] Runtime Partition Pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Beena Emerson <memissemerson(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [HACKERS] Runtime Partition Pruning
Date: 2017-12-18 10:33:52
Message-ID: CAKJS1f_zp7brpSXYtrPcpdjPsuEePv5jrikFjJVS96ZwgkTkXg@mail.gmail.com
Lists: pgsql-hackers

On 18 December 2017 at 21:31, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2017/12/16 15:05, David Rowley wrote:
>> I've been looking over this and I think that the use of the
>> PartitionDispatch in set_append_subplan_indexes is not correct. What
>> we need here is the index of the Append's subnode and that's not what
>> RelationGetPartitionDispatchInfo() gives you. Remember that some
>> partitions could have been pruned away already during planning.
>
> A somewhat similar concern is being discussed on the "UPDATE partition
> key" thread [1]. In that case, ExecInitModifyTable(), when initializing
> tuple routing information to handle the "update partition key" case, will
> have to deal with the fact that there might be fewer sub-plans in the
> ModifyTable node than there are partitions in the partition tree. That
> is, source partitions that planner would have determined after pruning,
> could be fewer than possible target partitions for rows from the source
> partitions to move to, of which the latter consists of *all* partitions.
> So, we have to have a mapping from leaf partition indexes as figured out
> by RelationGetPartitionDispatchInfo() (indexes that are offsets into a
> global array for *all* partitions), to sub-plan indexes which are offsets
> into the array for only those partitions that have a sub-plan. Such
> mapping is built (per the latest patch on that thread) by
> ExecSetupPartitionTupleRouting() in execPartition.c.

Surely this is a different problem? With UPDATE of a partition key, if
the planner eliminates all but 1 partition the UPDATE could cause that
tuple to be "moved" into any leaf partition, very possibly one that's
been eliminated during planning.

In the case of run-time Append pruning, we can forget about all
partitions that the planner managed to eliminate; we'll never need to
touch those, ever. All we care about here is reducing the number of
partitions further using values that were not available during
planning.
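
A minimal sketch of the indexing issue, not code from the patch: every
name below (build_subnode_map, select_subnodes, kept_by_planner,
matches_at_runtime) is hypothetical and none of it is PostgreSQL API.
It assumes partitions are numbered 0..nparts-1 and that the Append's
subnodes keep the same relative order as the surviving partitions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: fill map[i] with the Append subnode index of
 * partition i, or -1 if the planner already pruned partition i.
 * Returns the number of subnodes the Append ends up with.
 */
int
build_subnode_map(const bool *kept_by_planner, int nparts, int *map)
{
    int nsubnodes = 0;

    assert(nparts >= 0);
    for (int i = 0; i < nparts; i++)
        map[i] = kept_by_planner[i] ? nsubnodes++ : -1;
    return nsubnodes;
}

/*
 * Hypothetical helper: given the partitions that match the values
 * known only at run time, write the subnode indexes to scan into out[]
 * and return how many there are.  Partitions pruned at plan time have
 * no subnode and are simply skipped.
 */
int
select_subnodes(const int *map, const bool *matches_at_runtime,
                int nparts, int *out)
{
    int n = 0;

    for (int i = 0; i < nparts; i++)
    {
        if (matches_at_runtime[i] && map[i] >= 0)
            out[n++] = map[i];
    }
    return n;
}
```

The point of the sketch is that a partition index and a subnode index
only coincide when the planner pruned nothing; once it has, the run-time
code needs the translated index.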

> We could do something similar here using a similar code structure. Maybe,
> add an ExecSetupPartitionRuntimePruning() in execPartition.c (mimicking
> ExecSetupPartitionTupleRouting), that accepts AppendState node.
> Furthermore, it might be a good idea to have something similar to
> ExecFindPartition(), say, ExecGetPartitions(). That is, we have new
> functions for run-time pruning that are counterparts to corresponding
> functions for tuple routing.

Seems to me that in this case we'd be better off building this
structure during planning and saving it with the plan so that it can be
used over and over, rather than building it again each time the plan is
executed. A likely common use case for run-time pruning is a plan that
is executed multiple times with different parameters, so we really
don't want to repeat any work here that we don't have to.
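
A rough sketch of the build-once, reuse-many shape, under loud
assumptions: PruneInfo, plan_build_prune_info and exec_find_subnode are
hypothetical names, not PostgreSQL types or functions, and the
partitioning is simplified to contiguous integer ranges sorted by an
exclusive upper bound, with the first range unbounded below.

```c
#include <assert.h>

#define MAX_PARTS 8

/* Hypothetical structure that would be saved with the plan. */
typedef struct PruneInfo
{
    int nparts;
    int upper[MAX_PARTS];   /* exclusive upper bound of each partition */
    int subnode[MAX_PARTS]; /* Append subnode index, -1 if planner-pruned */
} PruneInfo;

/* Runs once, at plan time: record bounds and the subnode mapping. */
void
plan_build_prune_info(PruneInfo *info, const int *upper,
                      const int *kept, int nparts)
{
    int next = 0;

    assert(nparts >= 0 && nparts <= MAX_PARTS);
    info->nparts = nparts;
    for (int i = 0; i < nparts; i++)
    {
        info->upper[i] = upper[i];
        info->subnode[i] = kept[i] ? next++ : -1;
    }
}

/*
 * Runs on every execution: a read-only lookup for "key = param".
 * Returns the subnode to scan, or -1 if the matching partition was
 * pruned at plan time or no partition accepts the value.
 */
int
exec_find_subnode(const PruneInfo *info, int param)
{
    for (int i = 0; i < info->nparts; i++)
    {
        if (param < info->upper[i])
            return info->subnode[i];
    }
    return -1;
}
```

With bounds {10, 20, 30, 40} and only the middle two partitions kept by
the planner, a parameter of 15 maps to subnode 0 and 25 to subnode 1;
each execution pays only for the lookup, not for rebuilding the map.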

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
