Re: path toward faster partition pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: path toward faster partition pruning
Date: 2017-11-08 04:44:59
Views: Raw Message | Whole Thread | Download mbox
Lists: pgsql-hackers

On 7 November 2017 at 01:52, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Thanks. I'll look over it all again starting my Tuesday morning. (UTC+13)

Hi Amit,

I had another look over this today. Apologies if any of the review seems petty.

Here goes:

1. If test seems to be testing for a child that's a partitioned table,
but is testing for a non-NULL part_scheme.

* If childrel is itself partitioned, add it and its partitioned
* children to the list being propagated up to the root rel.
if (childrel->part_scheme && rel->part_scheme)

Should this code use IS_PARTITIONED_REL() instead? Seems a bit strange
to test for a NULL part_scheme

2. There's a couple of mistakes in my bms_add_range() code. I've
attached bms_add_range_fix.patch. Can you apply this to your tree?

3. This assert seems to be Asserting the same thing twice:

Assert(rel->live_partitioned_rels != NIL &&
list_length(rel->live_partitioned_rels) > 0);

A List with length == 0 is always NIL.

4. get_partitions_from_clauses(), can you comment why you perform the
list_concat() there.

I believe this is there so that the partition bound from the parent is
passed down to the child so that we can properly eliminate all child
partitions when the 2nd level of partitioning is using the same
partition key as the 1st level. I think this deserves a paragraph of
comment to explain this.

5. Please add a comment to explain what's going on here in

if (partattno == 0)
partexpr = lfirst(partexprs_item);
partexprs_item = lnext(partexprs_item);

Looks like, similar to index expressions, that partition expressions
are attno 0 to mark to consume the next expression from the list.

Does this need validation that there are enough partexprs_item items
like what is done in get_range_key_properties()? Or is this validated
somewhere else?

6. Comment claims the if test will test something which it does not
seem to test for:

* Redundant key elimination using btree-semantics based tricks.
* Only list and range partitioning use btree operator semantics, so
* skip otherwise. Also, if there are expressions whose value is yet
* unknown, skip this step, because we need to compare actual values
* below.
memset(keyclauses, 0, PARTITION_MAX_KEYS * sizeof(List *));
if (partkey->strategy == PARTITION_STRATEGY_LIST ||
partkey->strategy == PARTITION_STRATEGY_RANGE)

I was expecting this to be skipped when the clauses contained a
non-const, but it does not seem to.

7. Should be "compare them"

* If the leftarg and rightarg clauses' constants are both of the type
* expected by "op" clause's operator, then compare then using the
* latter's comparison function.

But if I look at the code "compare then using the latter's comparison
function." is not true, it seems to use op's comparison function not
rightarg's. With most of the calls op and rightarg are the same, but
not all of them. The function shouldn't make that assumption even if
the args op was always the same as rightarg.

8. remove_redundant_clauses() needs an overview comment of what the
function does.

9. The comment should explain what we would do in the case of key < 3
AND key <= 2 using some examples.

/* try to keep only one of <, <= */

10. Wondering why this loop runs backward?

for (s = BTMaxStrategyNumber; --s >= 0;)

Why not just:

for (s = 0; s < BTMaxStrategyNumber; s++)

I can't see a special reason for it to run backward. It seems unusual,
but if there's a good reason that I've failed to realise then it's
maybe worth a comment.

11. Pleae comment on why *constfalse = true is set here:

if (!chk || s == (BTEqualStrategyNumber - 1))

if (partition_cmp_args(partopfamily, partopcintype, chk, eq, chk,
if (!test_result)
*constfalse = true;
/* discard the redundant key. */
xform[s] = NULL;

Looks like we'd hit this in a case such as: WHERE key = 1 AND key > 1.

Also please add a comment when discarding the redundant key maybe
explain that equality is more useful than the other strategies when
there's an overlap.

David Rowley
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
bms_add_range_fix.patch application/octet-stream 911 bytes

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Khandekar 2017-11-08 04:57:34 Re: UPDATE of partition key
Previous Message Kyotaro HORIGUCHI 2017-11-08 04:14:31 Re: Restricting maximum keep segments by repslots