Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Amit Langote <amitlangote09(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-04-03 12:02:50
Message-ID: 68087b42-ccfa-deab-a146-f8da73d9e3c5@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi David.

On 2018/04/02 21:03, David Rowley wrote:
> On 2 April 2018 at 17:18, Amit Langote wrote:
>> On 2018/03/31 0:55, David Rowley wrote:
>>> explain (analyze, costs off, summary off, timing off) execute q1 (1,2,2,0);
>>> QUERY PLAN
>>> --------------------------------
>>> Result (actual rows=0 loops=1)
>>> One-Time Filter: false
>>> (2 rows)
>>
>> Hmm. It is the newly added inversion step that's causing this. When
>> creating a generic plan (that is when the planning happens via
>> BuildCachedPlan called with boundParams set to NULL), the presence of
>> Params will cause an inversion step's source step to produce
>> scan-all-partitions sort of result, which the inversion step dutifully
>> inverts to a scan-no-partitions result.
>>
>> I have tried to attack that problem by handling the
>> no-values-to-prune-with case using a side-channel to propagate the
>> scan-all-partitions result through possibly multiple steps. That is, a
>> base pruning step will set datum_offsets in a PruneStepResult only if
>> pruning is carried out by actually comparing values with the partition
>> bounds. If no values were provided (like in the generic plan case), it
>> will set a scan_all_nonnull flag instead and return without setting
>> datum_offsets. Combine steps perform their combining duty only if
>> datum_offset contains a valid value, that is, if scan_all_nonnulls is not set.
>
> I'm afraid this is still not correct :-(
>
> The following code is not doing the right thing:
>
> + case COMBINE_UNION:
> + foreach(lc1, cstep->source_stepids)
> + {
> + int step_id = lfirst_int(lc1);
> + PruneStepResult *step_result;
> +
> + /*
> + * step_results[step_id] must contain a valid result,
> + * which is confirmed by the fact that cstep's step_id is
> + * greater than step_id and the fact that results of the
> + * individual steps are evaluated in sequence of their
> + * step_ids.
> + */
> + if (step_id >= cstep->step.step_id)
> + elog(ERROR, "invalid pruning combine step argument");
> + step_result = step_results[step_id];
> + Assert(step_result != NULL);
> +
> + result->scan_all_nonnull = step_result->scan_all_nonnull;
>
> The last line there is not properly performing a union, it just sets
> the result_scan_all_nonnull to whatever the last step's value was.>
> At the very least it should be |= but I don't really like this new code.
>
> Why did you move away from just storing the matching partitions in a
> Bitmapset? If you want to store all non-null partitions, then why not
> just set the bits for all non-null partitions? That would cut down on
> bugs like this since the combining of step results would just be
> simple unions or intersects.

As I mentioned in my previous email, I had to find a side-channel (that is
scan_all_nonnull) to store this information instead of doing it the
regular way, to differentiate the case where we need to scan all
partitions because of values in the base prune steps not being available
from the case where carrying out a step using actual values ends up
selecting all partitions. When creating a generic plan, values of none of
the Params that are added to base prune steps are available and that
results in reaching the actual pruning functions
(get_matching_hash/list/range_bounds) without any values, which results in
each of those functions, in returning all partitions containing non-null data.

But actually, the presence of only Params in the pruning steps should
result in the pruning not being invoked at all (at least for the static
pruning case), thus selecting all partitions containing non-null data. It
is better to implement that instead of a workaround like scan_all_nonnulls
side-channel I was talking about.

Fixed the patch to implement it that way.

> Also, the following code could be made a bit nicer
>
> + result = (PruneStepResult *) palloc0(sizeof(PruneStepResult));
> +
> + switch (context->strategy)
> + {
> + case PARTITION_STRATEGY_HASH:
> + result->bound_offsets = get_matching_hash_bound(context,
> + opstep->opstrategy,
> + values, nvalues,
> + partsupfunc,
> + opstep->nullkeys,
> + &result->scan_all_nonnull);
>
> Why not allocate the PruneStepResult inside the get_matching_*_bound,
> that way you wouldn't need all those out parameters to set the bool
> fields.

I thought it'd be nice to have perform_pruning_base_step generate the
actual PruneStepResult instead of the functions for individual
partitioning strategies, which in a way minimizes places where it is
manipulated. Since, we've divided bound searching into 3 separate
functions anyway, it also seemed better to me to have their signatures be
relevant to the partition strategy they cater to. The function for hash
partitioning, for example, never has to deal with setting the result for
the null or the default partition and the range partitioning function
doesn't have to worry about doing anything about for null partition.
Also, overall footprint of those 3 functions reduced because they don't
have to create the PruneStepResult themselves.

Attached v47.

Thanks,
Amit

Attachment Content-Type Size
v47-0001-Add-partsupfunc-to-PartitionSchemeData.patch text/plain 3.4 KB
v47-0002-Add-more-tests-for-partition-pruning.patch text/plain 16.7 KB
v47-0003-Faster-partition-pruning.patch text/plain 127.4 KB
v47-0004-Add-only-unpruned-partitioned-child-rels-to-part.patch text/plain 24.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2018-04-03 12:05:04 Re: Online enabling of checksums
Previous Message Magnus Hagander 2018-04-03 11:59:07 Re: [PATCH] Verify Checksums during Basebackups