Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-01-26 09:17:41
Message-ID: b307cf01-9083-2896-951e-402fbe0ae256@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi David.

I looked at both of your faster_partition_prune_v21_delta_drowley_v1.patch
and faster_partition_prune_v21_delta_drowley_v1_delta.patch and have
incorporated them into the main patch series to get us the attached v22
set. I like the result very much. Thank you!

On 2018/01/23 15:44, David Rowley wrote:
> Attached is what I had in mind about how to do this. Only the planner
> will need to call populate_partition_clause_info. The planner and
> executor can call get_partitions_from_clauseinfo. I'll just need to
> change the run-time prune patch to pass the PartitionClauseInfo into
> the executor in the Append node.

I liked the division into those two functions, although not quite the
embedding of "info" into the function names. I think it's good to just
call them populate_partition_clauses and get_partitions_from_clauses. It
seems OK though for their arguments to contain "info" in their names.

> I've also added the context struct that I mentioned above. It's
> currently not carrying much weight, but the idea is that this context
> will be passed around a bit more with the run-time pruning patch and
> it will also carry the details about parameter values. I'll need to
> modify a few signatures of functions like partkey_datum_from_expr to
> pass the context there too. I didn't do that here because currently,
> those functions have no use for the context with the fields that they
> currently have.

The context struct seems good to me too.

> I've also fixed a bug where when you built the commutator OpExpr in
> what's now called extract_partition_key_clauses the inputcollid was
> not being properly set. The changes I made there go much further than
> just that, I've completely removed the OpExpr from the PartClause
> struct as only two members were ever used. I thought it was better
> just to add those to PartClause instead.

I wondered if we should rename that to something like
PartClauseProperties, but maybe that's too long.

> I also did some renaming of variables that always assumed that the
> Expr being compared to the partition key was a constant. This is true
> now, but with run-time pruning patch, it won't be, so I thought it was
> better to do this here rather than having to rename them in the
> run-time pruning patch.

OK, seems fine.

> One thing I don't yet understand about the patch is the use of
> predicate_refuted_by() in get_partitions_from_or_clause_args(). I did
> adjust the comment above that code, but I'm still not certain I fully
> understand why that function has to be used instead of building the
> clauses for the OR being processed along with the remaining clauses.
> Is it that this was too hard to extract that you ended up using
> predicate_refuted_by()?

I have tried to explain that better in the updated comment in the new
patch, along with some code rearrangement to better make sense of what's
going on. Let me just copy-paste the new comment I wrote. I have tried
to rethink the solution a number of times but never came up with a
sensible alternative.

/*
* When matching an OR expression, it is only checked if at least one of
* its args matches the partition key, not all. For arguments that don't
* match, we cannot eliminate any of its partitions using
* get_partitions_from_clauses(). However, if the table is itself a
* partition, we may be able to prove using constraint exclusion that the
* clause refutes its partition constraint, that is, we can eliminate all
* of its partitions.
*/
foreach(lc, or_clause_args)
{

> I've also removed the makeBoolExpr call that you were encapsulating
> the or_clauses in. I didn't really see the need for this since you
> just removed it again when looping over the or_clauses.

Ah, OK. I first became concerned that you said you were adding arguments
of different OR expressions into a single list and call it or_clauses, but
calmed down after checking that that's not the case. :)

> The only other changes are just streamlining code and comment changes.

I made a few of those myself in the updated patches.

Thanks a lot again for your work on this.

Regards,
Amit

Attachment Content-Type Size
v22-0001-Some-interface-changes-for-partition_bound_-cmp-.patch text/plain 11.7 KB
v22-0002-Introduce-a-get_partitions_from_clauses.patch text/plain 70.5 KB
v22-0003-Move-some-code-of-set_append_rel_size-to-separat.patch text/plain 8.6 KB
v22-0004-More-refactoring-around-partitioned-table-Append.patch text/plain 13.4 KB
v22-0005-Teach-planner-to-use-get_partitions_from_clauses.patch text/plain 34.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2018-01-26 09:56:58 Re: \describe*
Previous Message Andres Freund 2018-01-26 08:23:41 Re: JIT compiling with LLVM v9.0