Re: speeding up planning with partitions

From: Floris Van Nee <florisvannee(at)Optiver(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>, "jesper(dot)pedersen(at)redhat(dot)com" <jesper(dot)pedersen(at)redhat(dot)com>, "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2019-04-04 18:33:55
Message-ID: 1554402835150.73386@Optiver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

First of all I would like to thank everyone involved in this patch for their hard work on this. This is a big step forward. I've done some performance and functionality testing with the patch that was committed to master and it looks very good.
I had a question about the performance of pruning of functions like now() and current_date. I know these are handled differently, as they cannot be excluded during the first phases of planning. However, curerntly, this new patch makes the performance difference between the static timestamp variant and now() very obvious (even more than before). Consider
select * from partitioned_table where ts >= now()
or
select * from partitioned_table where ts >= '2019-04-04'

The second plans in less than a millisecond, whereas the first takes +- 180ms for a table with 1000 partitions. Both end up with the same plan.

I'm not too familiar with the code that handles this, but is there a possibility for improvement in this area? Or is the stage at which exclusion for now()/current_date occurs already too far in the process to make any good improvements to this? My apologies if this is considered off-topic for this patch, but I ran into this issue specifically when I was testing this patch, so I thought I'd ask here about it. I do think a large number of use-cases for tables with a large number of partitions involve a timestamp for partition key, and naturally people will start writing queries for this that use functions such as now() and current_date.

Thanks again for your work on this patch!

-Floris

________________________________________
From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Sent: Tuesday, April 2, 2019 7:50 AM
To: Tom Lane
Cc: David Rowley; Imai Yoshikazu; jesper(dot)pedersen(at)redhat(dot)com; Imai, Yoshikazu; Amit Langote; Alvaro Herrera; Robert Haas; Justin Pryzby; Pg Hackers
Subject: Re: speeding up planning with partitions [External]

Thanks for taking a look.

On 2019/04/02 2:34, Tom Lane wrote:
> Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
>> On 2019/03/30 0:29, Tom Lane wrote:
>>> That seems like probably an independent patch --- do you want to write it?
>
>> Here is that patch.
>> It revises get_relation_constraints() such that the partition constraint
>> is loaded in only the intended cases.
>
> So I see the problem you're trying to solve here, but I don't like this
> patch a bit, because it depends on root->inhTargetKind which IMO is a
> broken bit of junk that we need to get rid of. Here is an example of
> why, with this patch applied:
>
> regression=# create table p (a int) partition by list (a);
> CREATE TABLE
> regression=# create table p1 partition of p for values in (1);
> CREATE TABLE
> regression=# set constraint_exclusion to on;
> SET
> regression=# explain select * from p1 where a = 2;
> QUERY PLAN
> ------------------------------------------
> Result (cost=0.00..0.00 rows=0 width=0)
> One-Time Filter: false
> (2 rows)
>
> So far so good, but watch what happens when we include the same case
> in an UPDATE on some other partitioned table:
>
> regression=# create table prtab (a int, b int) partition by list (a);
> CREATE TABLE
> regression=# create table prtab2 partition of prtab for values in (2);
> CREATE TABLE
> regression=# explain update prtab set b=b+1 from p1 where prtab.a=p1.a and p1.a=2;
> QUERY PLAN
> ---------------------------------------------------------------------------
> Update on prtab (cost=0.00..82.30 rows=143 width=20)
> Update on prtab2
> -> Nested Loop (cost=0.00..82.30 rows=143 width=20)
> -> Seq Scan on p1 (cost=0.00..41.88 rows=13 width=10)
> Filter: (a = 2)
> -> Materialize (cost=0.00..38.30 rows=11 width=14)
> -> Seq Scan on prtab2 (cost=0.00..38.25 rows=11 width=14)
> Filter: (a = 2)
> (8 rows)
>
> No constraint exclusion, while in v10 you get
>
> Update on prtab (cost=0.00..0.00 rows=0 width=0)
> -> Result (cost=0.00..0.00 rows=0 width=0)
> One-Time Filter: false
>
> The reason is that this logic supposes that root->inhTargetKind describes
> *all* partitioned tables in the query, which is obviously wrong.
>
> Now maybe we could make it work by doing something like
>
> if (rel->reloptkind == RELOPT_BASEREL &&
> (root->inhTargetKind == INHKIND_NONE ||
> rel->relid != root->parse->resultRelation))

Ah, you're right. inhTargetKind has to be checked in conjunction with
checking whether the relation is the target relation.

> but I find that pretty messy, plus it's violating the concept that we
> shouldn't be allowing messiness from inheritance_planner to leak into
> other places.

I'm afraid that we'll have to live with this particular hack as long as we
have inheritance_planner(), but we maybe could somewhat reduce the extent
to which the hack is spread into other planner files.

How about we move the part of get_relation_constraints() that loads the
partition constraint to its only caller
relation_excluded_by_constraints()? If we do that, all uses of
root->inhTargetKind will be confined to one place. Attached updated patch
does that.

> What I'd rather do is have this test just read
>
> if (rel->reloptkind == RELOPT_BASEREL)
>
> Making it be that way causes some changes in the partition_prune results,
> as attached, which suggest that removing the enable_partition_pruning
> check as you did wasn't such a great idea either. However, if I add
> that back in, then it breaks the proposed new regression test case.
>
> I'm not at all clear on what we think the interaction between
> enable_partition_pruning and constraint_exclusion ought to be,
> so I'm not sure what the appropriate resolution is here. Thoughts?

Prior to 428b260f87 (that is, in PG 11), partition pruning for UPDATE and
DELETE queries is realized by applying constraint exclusion to the
partition constraint of the target partition. The conclusion of the
discussion when adding the enable_partition_pruning GUC [1] was that
whether or not constraint exclusion is carried out (to facilitate
partition pruning) should be governed by the new GUC, not
constraint_exclusion, if only to avoid confusing users.

428b260f87 has obviated the need to check enable_partition_pruning in
relation_excluded_by_constraints(), because inheritance_planner() now runs
the query as if it were SELECT, which does partition pruning using
partprune.c, governed by the setting of enable_partition_pruning. So,
there's no need to check it again in relation_excluded_by_constraints(),
because we won't be consulting the partition constraint again; well, at
least after applying the proposed patch.

> BTW, just about all the other uses of root->inhTargetKind seem equally
> broken from here; none of them are accounting for whether the rel in
> question is the query target.

There's only one other use of its value, AFAICS:

switch (constraint_exclusion)
{
case CONSTRAINT_EXCLUSION_OFF:

/*
* Don't prune if feature turned off -- except if the relation is
* a partition. While partprune.c-style partition pruning is not
* yet in use for all cases (update/delete is not handled), it
* would be a UI horror to use different user-visible controls
* depending on such a volatile implementation detail. Therefore,
* for partitioned tables we use enable_partition_pruning to
* control this behavior.
*/
if (root->inhTargetKind == INHKIND_PARTITIONED)
break;

Updated patch removes it though. Which other uses are there?

Attached patch is only for HEAD this time. I'll post one for PG 11 (if
you'd like) once we reach consensus on the best thing to do here is.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/flat/CAFjFpRcwq7G16J_w%2Byy_xiE7daD0Bm6iYTnhz81f79yrSOn4DA%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shawn Debnath 2019-04-04 18:52:52 Re: Retronym: s/magnetic disk/main data/
Previous Message Alvaro Herrera 2019-04-04 18:26:30 Re: query logging of prepared statements