Re: [HACKERS] path toward faster partition pruning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] path toward faster partition pruning
Date: 2018-02-06 09:55:10
Message-ID: 4850ab2a-7f2e-bc32-1dc4-0aec81717851@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/02/03 6:05, Robert Haas wrote:
> On Fri, Feb 2, 2018 at 9:33 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Updated set of patches attached (patches 0002 onward mostly unchanged,
>>> except I incorporated the delta patch posted by David upthread).
>>
>> Committed 0001. Thanks.
>
> Some preliminary thoughts...

Thanks for the review.

> Regarding 0002, I can't help noticing that this adds a LOT of new code
> to partition.c. With 0002 applied, it climbs into the top 2% of all
> ".c" files in terms of lines of code. It seems to me, though, that
> maybe it would make sense to instead add all of this code to some new
> file .c file, e.g. src/backend/optimizer/util/partprune.c. I realize
> that's a little awkward in this case because we're hoping to use this
> code at runtime and not just in the optimizer, but I don't have a
> better idea. Using src/backend/catalog as a dumping-ground for code
> that doesn't have a clear-cut place to live is not a superior
> alternative, for sure.

Agreed. partition.c has gotten quite big and actually more than half of
the code that this patch adds really seems to belong outside of it.

> And it seems to me that the code you're adding
> here is really quite similar to what we've already got in that
> directory -- for example, predtest.c, which currently does partition
> pruning, lives there; so does clauses.c, whose evaluate_expr facility
> this patch wants to use; so does relnode.c, which the other patches
> modify; and in general this looks an awful lot like other optimizer
> logic that decomposes clauses. I'm open to other suggestions but I
> don't think adding all of this directly into partition.c is a good
> plan.

Agreed.

A partprune.c in the optimizer's util directory seems like a good place.

> If we do add a new file for this code, the header comment for that
> file would be a good place to write an overall explanation of this new
> facility. The individual bits and pieces seem to have good comments
> but an overall explanation of what's going on here seems to be
> lacking.

OK, I will add such a comment.

> It doesn't seem good that get_partitions_from_clauses requires us to
> reopen the relation. I'm going to give my standard review feedback
> any time someone injects additional relation_open or heap_open calls
> into a patch: please look for a way to piggyback on one of the places
> that already has the relation open instead of adding code to re-open
> it elsewhere. Reopening it is not entirely free, and, especially when
> NoLock is used, makes it hard to tell whether we're doing the locking
> correctly. Note that we've already got things like
> set_relation_partition_info (which copies the bounds) and
> set_baserel_partition_key_exprs (which copies, with some partitioning,
> the partitioning expressions) and also find_partition_scheme, but
> instead of using that existing data from the RelOptInfo, this patch is
> digging it directly out of the relcache entry, which doesn't seem
> great.

OK, I have to admit that the quoted heap_open wasn't quite well thought
out and I'm almost sure that everything should be fine with the
information that set_relation_partition_info() fills in the RelOptInfo.
I'm now going through the patch to try to figure out how to make that work.

> The changes to set_append_rel_pathlist probably make it slower in the
> case where rte->relkind != RELKIND_PARTITIONED_TABLE. We build a
> whole new list that we don't really need. How about just keeping the
> (appinfo->parent_relid != parentRTindex) test in the loop and setting
> rel_appinfos to either root->append_rel_list or
> rel->live_part_appinfos as appropriate?

That's certainly better. Also in set_append_rel_size.

> I understand why COLLATION_MATCH think that a collation OID match is
> OK, but why is InvalidOid also OK? Can you add a comment? Maybe some
> test cases, too?

partcollid == InvalidOid means the partition key is of uncollatable type,
so further checking the collation is unnecessary.

There is a test in partition_prune.sql that covers the failure to prune
when collations don't match for a text partition key.

> How fast is this patch these days, compared with the current approach?
> It would be good to test both when nearly all of the partitions are
> pruned and when almost none of the partitions are pruned.

I will include some performance numbers in my next email, which hopefully
should not be later than Friday this week.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2018-02-06 10:07:45 Re: [HACKERS] MERGE SQL Statement for PG11
Previous Message Andreas Joseph Krogh 2018-02-06 09:49:33 Sv: Better Upgrades