Re: Proposal: Automatic partition creation

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Proposal: Automatic partition creation
Date: 2020-07-13 21:14:54
Message-ID: 524595d3-4a87-274a-e1b4-eeab65801af2@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 06.07.2020 17:59, Justin Pryzby wrote:
> I think you'd want to have an
> ALTER command for that (we would use that to change tables between
> daily/monthly based on their current size). That should also support setting
> the MODULUS of a HASH partitioned table, to allow changing the size of its
> partitions (currently, the user would have to more or less recreate the table
> and move all its data into different partitions, but that's not ideal).
New syntax fits to the ALTER command as well.

ALTER TABLE tbl
PARTITION BY HASH (number)
USING (partition_desc)

In simple cases (i.e. range partitioning granularity), it will simply
update
the rule of bound generation, saved in the catalog. More complex hash
partitions will require some rebalancing. Though, the syntax is pretty
straightforward for all cases. In the next versions, we can also add a
CONCURRENTLY keyword to cover partitioning of an existing
non-partitioned table
with data.

> I don't know if it's important for anyone, but it would be interesting to think
> about supporting sub-partitioning: partitions which are themselvese partitioned.
> Like something => something_YYYY => something_YYYY_MM => something_YYYY_MM_DD.
> You'd need to specify how to partition each layer of the heirarchy. In the
> most general case, it could be different partition strategy.

I suppose it will be a natural extension of this work. Now we need to
ensure
that the proposed syntax is extensible. Greenplum syntax, which I choose
as an
example, provides subpartition syntax as well.

> If you have a callback function for partition renaming, I think you'd want to
> pass it not just the current name of the partition, but also the "VALUES" used
> in partition creation. Like (2020-04-05)TO(2020-05-06). Maybe instead, we'd
> allow setting a "format" to use to construct the partition name. Like
> "child.foo_bar_%Y_%m_%d". Ideally, the formats would be fixed-length
> (zero-padded, etc), so failures with length can happen at "parse" time of the
> statement and not at "run" time of the creation. You'd still have to handle
> the case that the name already exists but isn't a partition (or is a partition
> by doesn't handle the incoming tuple for some reason).

In callback design, I want to use the best from pg_pathman's
set_init_callback().
The function accepts jsonb argument, which contains all the data about the
parent table, bounds, and so on. This information can be used to
construct name
for the partition and generate RENAME statement.

> Also, maybe your "configuration" syntax would allow specifying other values.
> Maybe including a retention period (as an INTERVAL for RANGE tables). That's
> useful if you had a command to PRUNE the oldest partitions, like ALTER..PRUNE.
In this version, I got rid of the 'configuration' keyword. Speaking of
retention, I think that it would be hard to cover all use-cases with a
declarative syntax. While it is relatively easy to implement deletion
within a
callback function. See rotation_callback example in pg_pathman [1].

[1]
https://github.com/postgrespro/pg_pathman/blob/79e11d94a147095f6e131e980033018c449f8e2e/sql/pathman_callbacks.sql#L107

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrízio de Royes Mello 2020-07-13 21:18:37 Re: pg_dump bug for extension owned tables
Previous Message Robert Haas 2020-07-13 21:12:18 recovering from "found xmin ... from before relfrozenxid ..."