Re: Constraint exclusion for partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Constraint exclusion for partitioned tables
Date: 2017-09-13 07:07:59
Message-ID: CAFjFpReMerGZ4hbRe=EQdxccFgcvt8EXTksjQbUySD6u69DkjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 13, 2017 at 12:17 PM, Jeevan Chalke
<jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>
>
> On Tue, Sep 12, 2017 at 8:12 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> On Tue, Sep 12, 2017 at 7:08 AM, Jeevan Chalke
>> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
>> > This patch clearly improves the planning time with given conditions.
>> >
>> > To verify that, I have created a table like:
>> > create table foo(a int, b int check (b > 100), c text) partition by
>> > range(a);
>> > And then used following query to get planning time:
>> > select * from foo where b < 100;
>> >
>> > And on my local setup, I have observed that,
>> > For 16 partitions, planning time was 0.234692 ms, which reduced to
>> > 0.112948
>> > ms with this patch.
>> > For 128 partitions, planning time was 1.62305 ms, which reduced to
>> > 0.654252
>> > ms with this patch.
>> > For 1024 partitions, planning time was 18.720993 ms, which reduced to
>> > 9.667395 ms with this patch.
>> >
>> > This clearly shows an improvement in planning time.
>>
>> What about the extra cost of checking the parent when it doesn't help?
>> In that case we will have some loss.
>>
>> I'm inclined to think that's OK, but it's something to think about.
>
>
> I have updated query like:

Thanks a lot Jeevan for all your experiments. They are very useful.

> select * from foo where b > 100;
> Which matches with the CHECK constraint, and here are the result on my local
> setup:

So, in this case, constraint exclusion fails since the WHERE condition
can not be refuted by the constraints.

>
> Time in milliseconds
> Partitions | without patch | with patch
> -----------|---------------|------------
> 2 | 0.072551 | 0.074154
> 4 | 0.102537 | 0.108024
> 8 | 0.162703 | 0.175017
> 16 | 0.288589 | 0.305285
> 128 | 2.7119 | 2.636247
> 1024 | 29.101347 | 29.48275
>
> So yes, as you said, it will have slight (may be negligible) overhead.
>
> This observation are from local setup and I have also seen a large standard
> deviation in the runs.

For a regular table if the constraint exclusion fails, we will waste
those many CPU cycles. But if the relation is excluded we will save
disk I/O or buffer access and time to apply the conditions on all the
rows in the relation. Given the magnitude of difference in the time to
run constraint exclusion and time for all those things, we take the
hit and run constraint exclusion always.

For a partitioned table, this patch saves the time to run constraint
exclusion on all the partitions if constraint exclusion succeeds on
the partitioned table. If constraint exclusion fails, we have wasted
CPU cycles on one run of constraint exclusion. The difference between
the time spent in the two scenarios increases with the number of
partitions. Practically, users will have a handful partitions rather
than a couple and thus running overhead of running constraint
exclusion on partitioned table would be justified given the time it
will save when CE succeeds.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-09-13 07:16:32 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Prabhat Sahu 2017-09-13 07:04:28 Re: Parallel Hash take II