Re: [Sender Address Forgery]Re: [Sender Address Forgery]Re: [HACKERS] path toward faster partition pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Sender Address Forgery]Re: [Sender Address Forgery]Re: [HACKERS] path toward faster partition pruning
Date: 2018-01-17 04:05:44
Message-ID: CAKJS1f-FNzwx+R7Epu+2p3-NC5Zmkz-ss7=QvBXyS4s6URLAjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 16 January 2018 at 21:08, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Attached v20. Thanks again.

Thanks for working on v20. I've had a look over part of it and I've
written down the following:

1. The following comment is not correct

/*
* Equality look up key. Values in the following array appear in no
* particular order (unlike minkeys and maxkeys below which must appear in
* the same order as the partition key columns).

These must be in partition key order, just like the others.

This part is not true either:

* the same order as the partition key columns). n_eqkeys must be equal to
* the number of partition keys to be valid (except in the case of hash
* partitioning where that's not required). When set, minkeys and maxkeys
* are ignored.

range2 is pruned just fine from the following:

create table rangep (a int, b int) partition by range (a,b);
create table rangep1 partition of rangep for values from (1,10) to (1,20);
create table rangep2 partition of rangep for values from (2,10) to (2,20);

explain select * from rangep where a = 1;
QUERY PLAN
---------------------------------------------------------------
Append (cost=0.00..38.25 rows=11 width=8)
-> Seq Scan on rangep1 (cost=0.00..38.25 rows=11 width=8)
Filter: (a = 1)
(3 rows)

2. You've added a list_copy() to get_partitions_from_clauses so as not
to modify the input list, but this function calls
get_partitions_from_clauses_recurse which calls
classify_partition_bounding_keys() which modifes that list. Would it
not just be better to make a list copy in
get_partitions_from_clauses() without any conditions?

If we get new users of that function, e.g Run-time pruning, then they
might be surprised to see new items magically added to their input
list without mention of that behaviour in the function comment.

3. The following case causes an Assert failure:

drop table listp;
CREATE TABLE listp (a int, b int) partition by list (a);
create table listp1 partition of listp for values in (1);
create table listp2 partition of listp for values in (2);

prepare q1 (int) as select * from listp where a = 1 and a in($1,10);

explain execute q1 (1);
explain execute q1 (1);
explain execute q1 (1);
explain execute q1 (1);
explain execute q1 (1);
explain execute q1 (1); -- <--- Assert failure!

In match_clauses_to_partkey you always add the ScalarArrayOpExpr to
the result regardless if it is a complete set of Consts, however, the
code in classify_partition_bounding_keys() that deals with
ScalarArrayOpExpr in can't handle non-consts

/* OK to add to the result. */
result = lappend(result, clause);
if (IsA(eval_const_expressions(root, rightop), Const))
*contains_const = true;
else
*contains_const = false;

*contains_consts is reset to true again by the a = 1 qual, so
get_partitions_from_clauses() gets called from
get_append_rel_partitions. Later classify_partition_bounding_keys()
when processing the ScalarArrayOpExpr, the following code assumes the
array exprs are all Consts:

foreach(lc1, elem_exprs)
{
Const *rightop = castNode(Const, lfirst(lc1));

Setting *contains_const = false; in match_clauses_to_partkey() is not
correct either. If I understand the intent here correctly, you want
this to be set to true if the clause list contains quals with any
consts that are useful for partition pruning during planning. If
that's the case then you should set it to true if you find a suitable
clause, otherwise leave it set to false as you set it to at the start
of the function. What you have now will have varying results based on
the order of the clauses in the list, which is certainly not correct.

4. The following code can be rearranged to not pull_varnos if there's
no commutator op.

constexpr = leftop;
constrelids = pull_varnos((Node *) leftop);
expr_op = get_commutator(expr_op);

/*
* If no commutator exists, cannot flip the qual's args,
* so give up.
*/
if (!OidIsValid(expr_op))
continue;

5. Header comment for match_clauses_to_partkey() says only clauses in
the pattern of "partkey op const" and "const op partkey" are handled.
NULL tests are also mentioned but nothing is mentioned about
ScalarArrayOpExpr. It might be better to be less verbose about what
the function handles, but if you're listing what is handled then you
should not make false claims.

* For an individual clause to match with a partition key column, the clause
* must be an operator clause of the form (partkey op const) or (const op
* partkey); the latter only if a suitable commutator exists. Furthermore,

6. Which brings me to; why do we need match_clauses_to_partkey at all?
classify_partition_bounding_keys seems to do all the work
match_clauses_to_partkey does, plus more. Item #3 above is caused by
an inconsistency between these functions. What benefit does
match_clauses_to_partkey give? I might understand if you were creating
list of clauses matching each partition key, but you're just dumping
everything in one big list which causes
classify_partition_bounding_keys() to have to match each clause to a
partition key again, and classify_partition_bounding_keys is even
coded to ignore clauses that don't' match any key, so it makes me
wonder what is match_clauses_to_partkey actually for?

I'm going to stop reviewing there as if you remove
match_clauses_to_partkey is going to cause churn that'll need to be
reviewed again.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-01-17 04:09:52 Re: TOAST table created for partitioned tables
Previous Message David Rowley 2018-01-17 03:32:16 Re: [Sender Address Forgery]Re: [Sender Address Forgery]Re: [HACKERS] path toward faster partition pruning