Re: Add SPLIT PARTITION/MERGE PARTITIONS commands

From: Dmitry Koval <d(dot)koval(at)postgrespro(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands
Date: 2022-11-28 22:30:14
Message-ID: 5fc93b25-768a-dc84-b14c-183873476f1e@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm sorry, I couldn't answer earlier...

1.
> partbounds.c is adding 500+ LOC about checking if proposed partitions
> meet the requirements (don't overlap, etc). But a lot of those
> checks must already happen, no? Can you re-use/refactor the existing
> checks ?

I a bit reduced the number of lines in partbounds.c and added comments.
Unfortunately, it is very difficult to re-use existing checks for other
partitioned tables operations, because mostly part of PostgreSQL
commands works with a single partition.
So for SPLIT/MERGE commands were created new checks for several partitions.

2.
> Also, postgres already supports concurrent DDL (CREATE+ATTACH and
> DETACH CONCURRENTLY). Is it possible to leverage that ?
> (Mostly to reduce the patch size, but also because maybe some cases
> could be concurrent?).

Probably "ATTACH CONCURRENTLY" is not supported?
A few words about "DETACH CONCURRENTLY".
"DETACH CONCURRENTLY" can works because this command not move rows
during detach partition (and so no reason to block detached partition).
"DETACH CONCURRENTLY" do not changes data, but changes partition
description (partition is marked as "inhdetachpending = true" etc.).

For SPLIT and MERGE the situation is completely different - these
commands transfer rows between sections.
Therefore partitions must be LOCKED EXCLUSIVELY during rows transfer.
Probably we can use concurrently partitions not participating in SPLIT
and MERGE.
But now PostgreSQL has no possibilities to forbid using a part of
partitions of a partitioned table (until the end of data transfer by
SPLIT/MERGE commands).
Simple locking is not quite suitable here.
I see only one variant of SPLIT/MERGE CONCURRENTLY implementation that
can be realized now:

* ShareUpdateExclusiveLock on partitioned table;
* AccessExclusiveLock on partition(s) which will be deleted and will be
created during SPLIT/MEGRE command;
* transferring data between locked sections; operations with non-blocked
partitions are allowed;
* sessions which want to use partition(s) which will be deleted, waits
on locks;
* finally we release AccessExclusiveLock on partition(s) which will be
deleted and delete them;
* waiting sessions will get errors "relation ... does not exist" (we can
transform it to "relation structure was changed ... please try again"?).

It doesn't look pretty.
Therefore for the SPLIT/MERGE command the partitioned table is locked
with AccessExclusiveLock.

3.
> An UPDATE on a partitioned table will move tuples from one partition
> to another. Is there a way to re-use that?

This could be realized using methods that are called from
ExecCrossPartitionUpdate().
But using these methods is more expensive than the current
implementation of the SPLIT/MERGE commands.
SPLIT/MERGE commands uses "bulk insert" and there is low overhead for
finding a partition to insert data: for MERGE is not need to search
partition; for SPLIT need to use simple search from several partitions
(listed in the SPLIT command).
Below is a test example.

a. Transferring data from the table "test2" to partitions "partition1"
and "partition2" using the current implementation of tuple routing in
PostgreSQL:

CREATE TABLE test (a int, b char(10)) PARTITION BY RANGE (a);
CREATE TABLE partition1 PARTITION OF test FOR VALUES FROM (10) TO (20);
CREATE TABLE partition2 PARTITION OF test FOR VALUES FROM (20) TO (30);
CREATE TABLE test2 (a int, b char(10));
INSERT INTO test2 (a, b) SELECT 11, 'a' FROM generate_series(1, 1000000);
INSERT INTO test2 (a, b) SELECT 22, 'b' FROM generate_series(1, 1000000);
INSERT INTO test(a, b) SELECT a, b FROM test2;
DROP TABLE test2;
DROP TABLE test;

Three attempts (the results are little different), the best result:

INSERT 0 2000000
Time: 4467,814 ms (00:04,468)

b. Transferring data from the partition "partition0" to partitions
"partition 1" and "partition2" using SPLIT command:

CREATE TABLE test (a int, b char(10)) PARTITION BY RANGE (a);
CREATE TABLE partition0 PARTITION OF test FOR VALUES FROM (0) TO (30);
INSERT INTO test (a, b) SELECT 11, 'a' FROM generate_series(1, 1000000);
INSERT INTO test (a, b) SELECT 22, 'b' FROM generate_series(1, 1000000);
ALTER TABLE test SPLIT PARTITION partition0 INTO
(PARTITION partition0 FOR VALUES FROM (0) TO (10),
PARTITION partition1 FOR VALUES FROM (10) TO (20),
PARTITION partition2 FOR VALUES FROM (20) TO (30));
DROP TABLE test;

Three attempts (the results are little different), the best result:

ALTER TABLE
Time: 3840,127 ms (00:03,840)

So the current implementation of tuple routing is ~16% slower than the
SPLIT command.
That's quite a lot.

With best regards,
Dmitry Koval

Postgres Professional: http://postgrespro.com

Attachment Content-Type Size
v14-0001-ALTER-TABLE-MERGE-PARTITIONS-command.patch text/plain 104.2 KB
v14-0002-ALTER-TABLE-SPLIT-PARTITION-command.patch text/plain 175.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-11-28 22:31:50 Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas
Previous Message Nikita Malakhov 2022-11-28 22:27:24 Re: [PATCH] Infinite loop while acquiring new TOAST Oid