Re: How to make partitioning scale better for larger numbers of partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to make partitioning scale better for larger numbers of partitions
Date: 2018-07-13 06:28:52
Message-ID: 212f03bb-69e7-3ad2-e761-9ba8859cc2eb@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/07/13 14:49, Tsunakawa, Takayuki wrote:
> From: Amit Langote [mailto:Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp]
>> For SELECT/UPDATE/DELETE, overhead of partitioning in the planning phase
>> is pretty significant and gets worse as the number of partitions grows.
>> I
>> had intended to fix that in PG 11, but we could only manage to get part
>> of
>> that work into PG 11, with significant help from David and others. So,
>> while PG 11's overhead of partitioning during planning is less than PG
>> 10's due to improved pruning algorithm, it's still pretty far from ideal,
>> because it isn't just the pruning algorithm that had overheads. In fact,
>> PG 11 only removes the pruning overhead for SELECT, so UPDATE/DELETE still
>> carry the overhead that was in PG 10.
>
> David has submitted multiple patches for PG 12, one of which speeds up pruning of UPDATE/DELETE (I couldn't find it in the current CF, though.)

I don't think he has posted a new patch for improving the speed for
UDPATE/DELETE planning yet, although I remember he had posted a PoC patch
back in February or March.

> What challenges are there for future versions, and which of them are being addressed by patches in progress for PG 12, and which issues are untouched?

The immediate one I think is to refactor the planner such that the new
pruning code, that we were able to utilize for SELECT in PG 11, can also
be used for UPDATE/DELETE. Refactoring needed to replace the pruning
algorithm was minimal for SELECT, but not so much for UPDATE/DELETE.

Once we're able to utilize the new pruning algorithm for all the cases, we
can move on to refactoring to avoid locking and opening of all partitions.
As long as we're relying on constraint exclusion for partition pruning,
which we still do for UPDATE/DELETE, we cannot do that because constraint
exclusion has to look at each partition individually.

The UPDATE/DELETE planning for partitioning using huge memory and CPU is a
pretty big issue and refactoring planner to avoid that may be what's
hardest of all the problems to be solved here.

>> The overheads I mention stem from
>> the fact that for partitioning we still rely on the old planner code
>> that's used to perform inheritance planning, which requires to lock and
>> open *all* partitions. We have so far been able to refactor just enough
>> to use the new code for partition pruning, but there is much refactoring
>> work left to avoid needlessly locking and opening all partitions.
>
> I remember the issue of opening and locking all partitions was discussed before.

Quite a few times I suppose.

> Does this relate only to the planning phase?

If the benchmark contains queries that will need to access just one
partition, then yes the planning part has is the biggest overhead.

Execution-time overhead is limited to having an extra, possibly needless,
Append node, but I know David has patch for that too.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2018-07-13 06:53:05 RE: How to make partitioning scale better for larger numbers of partitions
Previous Message David Rowley 2018-07-13 06:25:06 Re: How to make partitioning scale better for larger numbers of partitions