Re: How to make partitioning scale better for larger numbers of partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to make partitioning scale better for larger numbers of partitions
Date: 2018-07-17 04:37:52
Message-ID: d8904127-a9bc-5d91-adda-16b3c8251f0b@lab.ntt.co.jp
Lists: pgsql-hackers

On 2018/07/17 12:14, Ashutosh Bapat wrote:
> On Tue, Jul 17, 2018 at 8:31 AM, Kato, Sho <kato-sho(at)jp(dot)fujitsu(dot)com> wrote:
>> On 2018/07/17 10:49, Amit Langote wrote:
>>> Perhaps, Kato-san only intended to report that the time that planner spends for a partitioned table with 1100 partitions is just too high compared to the time it spends on a non-partitioned table.
>>
>> Yes, it is included for the purposes of this comparison.
>>
>> The purpose of this comparison is to find where the partitioning bottleneck is.
>> Using the bottleneck as a hint for improvement, I'd like to bring the performance of a partitioned table as close as possible to that of an unpartitioned table.
>>
>
> That's a good thing, but may not turn out to be realistic. We should
> compare performance where partitioning matters and try to improve in
> those contexts. Else we might improve performance in scenarios which
> are never used.
>
> In this case, even if we improve the planning time by 100%, it hardly
> matters since planning time is negligible compared to the execution
> time because of huge data where partitioning is useful.

While I agree that it's a good idea to tell users to use partitioning only
if its overhead, especially the planner overhead, is bearable, this
benchmark shows us that even for what I assume are fairly common queries
(select .. from / update .. / delete from partitioned_table where
partkey = ?), the planner spends way too many redundant cycles. Some
amount of that overhead will always remain, and planning with partitioning
will always be a bit slower than without it, but it's *too* slow right now
for a non-trivial number of partitions.

Thanks,
Amit
