Proposal: Automatic partition creation

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Proposal: Automatic partition creation
Date: 2020-07-06 10:45:52
Message-ID: 7fec3abb-c663-c0d2-8452-a46141be6d4a@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The previous discussion of automatic partition creation [1] has
addressed static and dynamic creation of partitions and ended up with
several syntax proposals.
In this thread, I want to continue this work.

Attached is PoC for static partition creation. The patch core is quite
straightforward. It adds one more transform clause to convert given
partitioning specification into several CREATE TABLE statements.

The patch implements following syntax:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL
range_step FROM range_start TO range_end

For more examples check auto_partitions.sql in the patch.

TODO:

- CONFIGURATION is just an existing keyword, that I picked as a stub.
 Ideas on better wording are welcome.

- IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
I wonder, is it worth placing a stub for dynamic partitioning, or we can
rather add these keywords later.

- HASH and LIST static partitioning works as expected.
Testing and feedback are welcome.

- RANGE partitioning is not really implemented in this patch.
Now it only accepts interval data type as 'interval' and respectively
date types as range_start and range_end expressions.
Only one partition is created. I found it difficult to implement the
generation of bounds using internal functions and data types.
Both existing solutions (pg_pathman and pg_partman) rely on SQL level
routines [2].
I am going to implement this via SPI, which allow to simplify checks and
calculations. Do you see any pitfalls in this approach?

- Partition naming. Now partition names for all methods look like
$tablename_$partnum
Do we want more intelligence here? Now we have
RunObjectPostCreateHook(), which allows to rename the table.
To make it more user-friendly, we can later implement pl/pgsql function
that sets the callback, as it is done in pg_pathman set_init_callback() [3].

- Current design doesn't allow to create default partition
automatically. Do we need this functionality?

- Do you see any restrictions for future extensibility (dynamic
partitioning, init_callback, etc.) in the proposed design ?

I expect this to be a long discussion, so here is the wiki page [4] to
fix important questions and final agreements.

[1]
https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1907150711080.22273%40lancre
[2]
https://github.com/postgrespro/pg_pathman/blob/dbcbd02e411e6acea6d97f572234746007979538/range.sql#L99
[3] https://github.com/postgrespro/pg_pathman#additional-parameters
[4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-WIP-create-partitions-automatically.patch text/x-patch 29.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-07-06 11:45:41 Re: Resetting spilled txn statistics in pg_stat_replication
Previous Message yuzuko 2020-07-06 10:35:37 Re: Autovacuum on partitioned table (autoanalyze)