Re: [HACKERS] Runtime Partition Pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [HACKERS] Runtime Partition Pruning
Date: 2018-04-06 02:39:07
Message-ID: d0c21e44-6e74-6093-7481-1757b09a48b4@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi David.

On 2018/04/05 22:41, David Rowley wrote:
>> * make_partition_pruneinfo has a parameter resultRelations that's not used
>> anywhere
>
> It gets used in 0005.
>
> I guess I could only add it in 0005, but make_partition_pruneinfo is
> only used in 0003, so you could say the same about that entire
> function.
>
> Do you think I should delay adding that parameter until the 0005 patch?

Yes, I think.

>> * In make_partition_pruneinfo()
>>
>> allsubnodeindex = palloc(sizeof(int) * root->simple_rel_array_size);
>> allsubpartindex = palloc(sizeof(int) * root->simple_rel_array_size);
>>
>> I think these arrays need to have root->simple_rel_array_size + 1
>> elements, because they're subscripted using RT indexes which start at 1.
>
> RT indexes are always 1-based. See setup_simple_rel_arrays. It already
> sets the array size to list_length(rtable) + 1.

Oh, I missed that simple_rel_array_size itself is set to consider 1-based
RT indexes.

relnode.c:73
root->simple_rel_array_size = list_length(root->parse->rtable) + 1;

>> * Also in make_partition_pruneinfo()
>>
>> /* Initialize to -1 to indicate the rel was not found */
>> for (i = 0; i < root->simple_rel_array_size; i++)
>> {
>> allsubnodeindex[i] = -1;
>> allsubpartindex[i] = -1;
>> }
>>
>> Maybe, allocate the arrays above mentioned using palloc0 and don't do this
>> initialization. Instead make the indexes that are stored in these start
>> with 1 and consider 0 as invalid entries.
>
> 0 is a valid subplan index. It is possible to make this happen, but
> I'd need to subtract 1 everywhere I used the map. That does not seem
> very nice. Seems more likely to result in bugs where we might forget
> to do the - 1.

You can subtract 1 right here in make_partition_pruneinfo before setting
the values in PartitionPruneInfo's subplan_indexes and parent_indexes.
I'm only proposing to make make_partition_pruneinfo() a bit faster by not
looping over both the local map arrays setting them to -1.

IOW, I'm not saying that we emit PartitionPruneInfo nodes that contain
1-based indexes.

> Did you want this because you'd rather have the palloc0() than the for
> loop setting the array elements to -1? Or is there another reason?

Yeah, that's it.

>> * Instead of nesting PartitionedRelPruning inside another, just store them
>> in a global flat array in the PartitionPruning, like PartitionTupleRouting
>> does for PartitionDispatch of individual partitioned tables in the tree.
>>
>> typedef struct PartitionedRelPruning
>> {
>> int nparts;
>> int *subnodeindex;
>> struct PartitionedRelPruning **subpartprune;
>
> There is a flat array in PartitionPruning. subpartprune contains
> pointers into that array. I want to have this pointer array so I can
> directly reference the flat array in PartitionPruning.
>
> Maybe I've misunderstood what you mean here.

I think we can save some space here by not having the pointers stored
here. Instead of passing the pointer itself in the recursive calls to
find_subplans_for_extparams_recurse, et al, just pass the entire array and
an offset to use for the given recursive invocation.

If you look at ExecFindPartition used for tuple routing, you may see that
there no recursion at all. Similarly find_subplans_for_extparams_recurse,
et al might be able to avoid recursion if similar tricks are used.

Finally about having two different functions for different sets of Params:
can't we have just named find_subplans_for_params_recurse() and use the
appropriate one based on the value of some parameter? I can't help but
notice the duplication of code.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-04-06 02:47:05 Re: Fix for pg_stat_activity putting client hostaddr into appname field
Previous Message Amit Langote 2018-04-06 02:03:42 Re: [HACKERS] path toward faster partition pruning