Re: How to make partitioning scale better for larger numbers of partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to make partitioning scale better for larger numbers of partitions
Date: 2018-07-17 01:48:48
Message-ID: 4a7d783e-71bb-0db2-1486-9d08832c1bb5@lab.ntt.co.jp
Lists: pgsql-hackers

On 2018/07/13 22:10, Ashutosh Bapat wrote:
> On Fri, Jul 13, 2018 at 9:23 AM, Kato, Sho <kato-sho(at)jp(dot)fujitsu(dot)com> wrote:
>>> I wondered if you compared to PG10 or to inheritance-partitioning (parent with relkind='r' and either trigger or rule or INSERT/UPDATE directly into child) ?
>>
>> Thank you for your reply.
>>
>> I compared with a non-partitioned table on PG11beta2.
>>
>> The non-partitioned table has 1100 records in one table.
>> The partitioned table has one record in each leaf partition.
>>
>
> I don't think partitioning should be employed this way even for the
> sake of comparison. Depending upon the size of each tuple, if 1100 tuples
> are inserted into a single table, they will probably occupy a few
> hundred pages. In a partitioned table with one tuple per partition
> they will occupy 1100 pages at least. There are other space and locking
> overheads to maintain 1100 tables. I think the right way to compare is
> to have really large data which really requires 1100 partitions and
> then compare performance by putting that data in 1100 partitions and
> in an unpartitioned table. Even with that kind of data, you will see
> some difference in performance, but that won't be as dramatic as you
> report.
>
> I might be missing something though.

Perhaps Kato-san only intended to report that the time the planner
spends on a partitioned table with 1100 partitions is far too high
compared to the time it spends on a non-partitioned table. It is, and has
been, clear that that's because the planning time explodes as the number of
partitions increases.

If there's lots of data in the table, then the result will look completely
different, as you say, because scanning a single partition (of the 1100
total) will take less time than scanning a non-partitioned table
containing 1100 partitions' worth of data. But the planning time would
still be higher for the partitioned table, which seems to be the point of
this benchmark.
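
To make the planning-versus-execution distinction easier to check, here is a
minimal sketch of how such a comparison could be set up. It is not the
script Kato-san used; the table names (plain, parted, parted_N), the column
layout, and the range bounds are all assumptions made for illustration. The
Python function only emits SQL text; one would run the statements in psql
and compare the planning and execution times that EXPLAIN (ANALYZE) reports
for the two SELECTs.

```python
def partition_benchmark_sql(num_partitions=1100):
    """Return SQL statements for a hypothetical planning-time comparison.

    All table and column names are made up for illustration; they do not
    come from the benchmark discussed in this thread.
    """
    stmts = [
        # One ordinary table holding every row.
        "CREATE TABLE plain (id int, val int);",
        # One range-partitioned parent (declarative partitioning, PG10+).
        "CREATE TABLE parted (id int, val int) PARTITION BY RANGE (id);",
    ]
    # One leaf partition per id value; the upper bound is exclusive.
    for i in range(1, num_partitions + 1):
        stmts.append(
            f"CREATE TABLE parted_{i} PARTITION OF parted "
            f"FOR VALUES FROM ({i}) TO ({i + 1});"
        )
    # Same rows in both tables: one row per partition in the partitioned case.
    for table in ("plain", "parted"):
        stmts.append(
            f"INSERT INTO {table} "
            f"SELECT g, g FROM generate_series(1, {num_partitions}) g;"
        )
    # EXPLAIN (ANALYZE) reports planning and execution time separately.
    for table in ("plain", "parted"):
        stmts.append(f"EXPLAIN (ANALYZE) SELECT * FROM {table} WHERE id = 1;")
    return stmts


if __name__ == "__main__":
    print("\n".join(partition_benchmark_sql()))
```

Loading many more rows per partition (a larger generate_series range and
correspondingly wider partition bounds) would be the variation Ashutosh
suggests; the planning-time figures for the two SELECTs are what this
sketch is meant to isolate.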

Thanks,
Amit
