Re: path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: path toward faster partition pruning
Date: 2017-09-25 10:04:36
Message-ID: b49a9246-aa56-d429-a241-2fe27564f7d7@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Dilip.

Thanks for looking at the patches and the comments.

On 2017/09/16 18:43, Dilip Kumar wrote:
> On Fri, Sep 15, 2017 at 2:20 PM, Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2017/09/15 11:16, Amit Langote wrote:
>
> Thanks for the updated patch. I was going through the logic of
> get_rel_partitions in 0002 as almost similar functionality will be
> required by runtime partition pruning on which Beena is working. The
> only difference is that here we are processing the
> "rel->baserestrictinfo" and in case of runtime pruning, we also need
> to process join clauses which are pushed down to appendrel.

Yeah, I agree with the point you seem to be making that
get_rel_partitions() covers a lot of functionality, which it would be nice
to break down into reusable function(s) with suitable signature(s) that
the executor will also be able to use.

Your proposed refactoring patch down-thread seems to be a good step in
that direction. Thanks for working on it.

> So can we make some generic logic which can be used for both the patches.
>
> So basically, we need to do two changes
>
> 1. In get_rel_partitions instead of processing the
> "rel->baserestrictinfo" we can take clause list as input that way we
> can pass any clause list to this function.
>
> 2. Don't call "get_partitions_for_keys" routine from the
> "get_rel_partitions", instead, get_rel_partitions can just prepare
> minkey, maxkey and the caller of the get_rel_partitions can call
> get_partitions_for_keys, because for runtime pruning we need to call
> get_partitions_for_keys at runtime.

It's not clear to me whether get_rel_partitions() itself, as it is, is
callable from outside the planner, because its signature contains
RelOptInfo. We have the RelOptInfo in the signature, because we want to
mark certain fields in it so that latter planning steps can use them. So,
get_rel_partitions()'s job is not just to match clauses and find
partitions, but also to perform certain planner-specific tasks of
generating information that the later planning steps will want to use.
That may turn out to be unnecessary, but until we know that, let's not try
to export get_rel_partitions() itself out of the planner.

OTOH, the function that your refactoring patch separates out to match
clauses to partition keys and extract bounding values seems reusable
outside the planner and we should export it in such a way that it can be
used in the executor. Then, the hypothetical executor function that does
the pruning will first call the planner's clause-matching function,
followed by calling get_partitions_for_keys() in partition.c to get the
selected partitions.

We should be careful when designing the interface of the exported function
to make sure it's not bound to the planner. Your patch still maintains
the RelOptInfo in the signature of the clause-matching function, which the
executor pruning function won't have access to.

> After these changes also there will be one problem that the
> get_partitions_for_keys is directly fetching the "rightop->constvalue"
> whereas, for runtime pruning, we need to store rightop itself and
> calculate the value at runtime by param evaluation, I haven't yet
> thought how can we make this last part generic.

I don't think any code introduced by the patch in partition.c itself looks
inside OpExpr (or any type of clause for that matter). That is, I don't
see where get_partitions_for_keys() is looking at rightop->constvalue.
All it receives to work with are arrays of Datums and some other relevant
information like inclusivity, nullness, etc.

By the way, I'm now rebasing these patches on top of [1] and will try to
merge your refactoring patch in some appropriate way. Will post more
tomorrow.

Thanks,
Amit

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9140cf826

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2017-09-25 10:20:07 Re: GUC for cleanup indexes threshold.
Previous Message Michael Paquier 2017-09-25 09:37:20 Re: Shaky coding for vacuuming partitioned relations