Re: [HACKERS] Runtime Partition Pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [HACKERS] Runtime Partition Pruning
Date: 2018-04-06 10:46:34
Message-ID: CAKJS1f-aGOQ0=_UrqhbSDQPb-V2THtpyHQS1PZtXSYgY8CoJkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 6 April 2018 at 00:54, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> * Regarding PartitionPruneInfo:
>
> I think the names of the fields could be improved -- like pruning_steps
> instead prunesteps, unpruned_parts instead of allpartindexs. The latter
> is even a bit misleading because it doesn't in fact contain *all*
> partition indexes, only those that are unpruned, because each either has a
> subpath or it appears in (unpruned) partitioned_rels list. Also, I didn't
> like the name subnodeindex and subpartindex very much. subplan_indexes
> and parent_indexes would sound more informative to me.

I've done a load of renaming. I went with subnode_map and subpart_map
instead of _indexes. I thought this was better. Let me know if you
disagree.

> * make_partition_pruneinfo has a parameter resultRelations that's not used
> anywhere

I've taken this out of 0001 and it is now introduced again in 0005,
where it's used.

> * Also in make_partition_pruneinfo()
>
> /* Initialize to -1 to indicate the rel was not found */
> for (i = 0; i < root->simple_rel_array_size; i++)
> {
> allsubnodeindex[i] = -1;
> allsubpartindex[i] = -1;
> }
>
> Maybe, allocate the arrays above mentioned using palloc0 and don't do this
> initialization. Instead make the indexes that are stored in these start
> with 1 and consider 0 as invalid entries.

I've made this change.

> * Regarding the code added in execPartition.c and execPartition.h:
>
> I wondered why PartitionedRelPruning is named the way it is. I saw many
> parallels with PartitionDispatchData both in terms of the main thing it
> consists of, that is, the map that translates partition indexes as in
> partition descriptor to that of subplans or of some other executor
> structure. Also, I imagine you tried to mimic PartitionTupleRouting with
> PartitionPruning but not throughout. For example, tuple routing struct
> pointer variables are throughout called proute, whereas PartitionPruning
> ones are called partprune instead of, say, pprune. Consistency would
> help, imho.

I've made a series of changes here too, but didn't use the word
"Dispatch" anywhere. I'm not really sure what the origin of this word
is. To me, it means to send something somewhere, which I thought might
be why we see it tuple routing, since the tuple is being "dispatched"
to the correct partition.

We're not dispatching anything anywhere in partition pruning, so don't
really think the term can be used here. Although, if you see some
other reason for using that word, please explain.

> * Instead of nesting PartitionedRelPruning inside another, just store them
> in a global flat array in the PartitionPruning, like PartitionTupleRouting
> does for PartitionDispatch of individual partitioned tables in the tree.
>
> typedef struct PartitionedRelPruning
> {
> int nparts;
> int *subnodeindex;
> struct PartitionedRelPruning **subpartprune;

I've kept the same subpart_map from the PartitionPruneInfo. I actually
ended up using that one without performing a memcpy() on it, since
we're not changing it anywhere. That might or might not be a good idea
since way might day think we can change it which would alter the
plan's copy, but on the other hand, there's a little performance boost
in not performing a copy.

> * I don't see why there needs to be nparts in the above, because it
> already has a PartitionPruneContext member which has that information.

I've removed that field.

I've also been doing a bit of work on the 0005 patch:

1. It now properly supports skipping subplans when exec params can
eliminate certain partitions. Previously I'd only coded it to work
with external params.
2. Fixed bug where statement level triggers would not fire when all
partitions were pruned.
3. Added tests

The patch is still based on v50 of the runtime pruning patch [1]

[1] https://www.postgresql.org/message-id/77a518ac-e4a0-4cd1-9988-e5d754a6501f%40lab.ntt.co.jp

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
v21-0001-Provide-infrastructure-to-allow-partition-prunin.patch application/octet-stream 40.6 KB
v21-0002-Add-bms_prev_member-function.patch application/octet-stream 5.3 KB
v21-0003-Allow-unneeded-Append-subnodes-to-be-pruned-at-e.patch application/octet-stream 87.1 KB
v21-0004-Allow-unneeded-MergeAppend-s-subnodes-to-be-prun.patch application/octet-stream 18.4 KB
v21-0005-Improve-planning-speed-of-partitioned-table-UPDA.patch application/octet-stream 39.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-04-06 11:06:42 Re: ERROR: invalid memory alloc request size 1073741824
Previous Message John Naylor 2018-04-06 10:17:51 Re: WIP: a way forward on bootstrap data