Declarative partitioning - another take

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Declarative partitioning - another take
Date: 2016-08-10 11:09:42
Message-ID: ad16e2f5-fc7c-cc2d-333a-88d4aa446f96@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Attached is the latest set of patches to implement declarative
partitioning. There is already a commitfest entry for the same:
https://commitfest.postgresql.org/10/611/

The old discussion is here:
http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp/

Attached patches are described below:

0001-Catalog-and-DDL-for-partition-key.patch
0002-psql-and-pg_dump-support-for-partitioned-tables.patch

These patches create the infrastructure and DDL for partitioned
tables.

In addition to a catalog for storing the partition key information, this
adds a new relkind to pg_class.h. PARTITION BY clause is added to CREATE
TABLE. Tables so created are RELKIND_PARTITIONED_REL relations which are
to be special in a number of ways, especially with regard to their
interactions with regular table inheritance features.

PARTITION BY RANGE ({ column_name | ( expression ) } [ opclass ] [, ...])
PARTITION BY LIST ({ column_name | ( expression ) } [ opclass ])

0003-Catalog-and-DDL-for-partition-bounds.patch
0004-psql-and-pg_dump-support-for-partitions.patch

These patches create the infrastructure and DDL for partitions.

Parent-child relationships of a partitioned table and its partitions are
managed behind-the-scenes with inheritance. That means there is a
pg_inherits entry and attributes, constraints, etc. are marked with
inheritance related information appropriately. However this case differs
from a regular inheritance relationship in a number of ways. While the
regular inheritance imposes certain restrictions on what elements a
child's schema is allowed to contain (both at creation time and
after-the-fact), the partitioning related code imposes further
restrictions. For example, while regular inheritance allows a child to
contain its own columns, the partitioning code disallows that. Stuff like
NO INHERIT marking on check constraints, ONLY are ignored by the the
partitioning code.

Partition DDL includes both a way to create new partition and "attach" an
existing table as a partition of parent partitioned table. Attempt to
drop a partition using DROP TABLE causes an error. Instead a partition
needs first to be "detached" from parent partitioned table. On the other
hand, dropping the parent drops all the partitions if CASCADE is specified.

CREATE TABLE partition_name
PARTITION OF parent_table [ (
{ column_name WITH OPTIONS [ column_constraint [ ... ] ]
| table_constraint }
[, ... ]
) ] partition_bound_spec
[ PARTITION BY {RANGE | LIST} ( { column_name | ( expression ) } [ opclass
] [, ...] )

CREATE FOREIGN TABLE [ IF NOT EXISTS ] table_name
PARTITION OF parent_table [ (
{ column_name WITH OPTIONS [ column_constraint [ ... ] ]
| table_constraint }
[, ... ]
) ] partition_bound_spec
SERVER server_name
[ OPTIONS ( option 'value' [, ... ] ) ]

ALTER TABLE parent ATTACH PARTITION partition_name partition_bound_spec [
VALIDATE | NO VALIDATE ]

ALTER TABLE parent DETACH PARTITION partition_name

partition_bound_spec is:

FOR VALUES { list_spec | range_spec }

list_spec in FOR VALUES is:

IN ( expression [, ...] )

range_spec in FOR VALUES is:

START lower-bound [ INCLUSIVE | EXCLUSIVE ] END upper-bound [ INCLUSIVE |
EXCLUSIVE ]

where lower-bound and upper-bound are:

{ ( expression [, ...] ) | UNBOUNDED }

expression can be a string literal, a numeric literal or NULL.

Note that the one can specify PARTITION BY when creating a partition
itself. That is to allow creating multi-level partitioned tables.

0005-Teach-a-few-places-to-use-partition-check-constraint.patch

A partition's bound implicitly constrains the values that are allowed in
the partition key of its rows. The same can be applied to partitions when
inserting data *directly* into them to make sure that only the correct
data is allowed in (if a tuple has been routed from the parent, that
becomes unnecessary). To that end, ExecConstraints() now includes the
above implicit check constraint in the list of constraints it enforces.

Further, to enable constraint based partition exclusion on partitioned
tables, the planner code includes in its list of constraints the above
implicitly defined constraints. This arrangement is temporary however and
will be rendered unnecessary when we implement special data structures and
algorithms within the planner in future versions of this patch to use
partition metadata more effectively for partition exclusion.

Note that the "constraints" referred to above are not some on-disk
structures but those generated internally on-the-fly when requested by a
caller.

0006-Introduce-a-PartitionTreeNode-data-structure.patch
0007-Tuple-routing-for-partitioned-tables.patch

These patches enable routing of tuples inserted into a partitioned table
to one of its leaf partitions. It applies to both COPY FROM and INSERT.
First of these patches introduces a data structure that provides a
convenient means for the tuple routing code to step down a partition tree
one level at a time. The second one modifies copy.c and executor to
implement actual tuple routing. When inserting into a partition, its row
constraints and triggers are applied. Note that the partition's
constraints also include the constraints defined on the parent. This
arrangements means however that the parent's triggers are not currently
applied.

Updates are handled like they are now for inheritance sets, however, if an
update makes a row change partition, an error will be thrown.

0008-Update-DDL-Partitioning-chapter.patch

This patch updates the partitioning section in the DDL chapter to reflect
the new methods made available for creating and managing partitioned table
and its partitions. Especially considering that it is no longer necessary
to define CHECK constraints and triggers/rules manually for constraint
exclusion and tuple routing, respectively.

TODO (in short term):
* Add more regression tests and docs
* Add PartitionOptInfo and use it to perform partition pruning more
effectively (the added infrastructure should also help pairwise joins
patch proposed by Ashutosh Bapat [1])
* Fix internal representation of list partition bounds to be more efficient

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj%3DEaDTSA%40mail.gmail.com

Attachment Content-Type Size
0001-Catalog-and-DDL-for-partitioned-tables.patch text/x-diff 116.2 KB
0002-psql-and-pg_dump-support-for-partitioned-tables.patch text/x-diff 22.8 KB
0003-Catalog-and-DDL-for-partitions.patch text/x-diff 196.3 KB
0004-psql-and-pg_dump-support-for-partitions.patch text/x-diff 17.8 KB
0005-Teach-a-few-places-to-use-partition-check-constraint.patch text/x-diff 17.2 KB
0006-Introduce-a-PartitionTreeNode-data-structure.patch text/x-diff 8.5 KB
0007-Tuple-routing-for-partitioned-tables.patch text/x-diff 39.8 KB
0008-Update-DDL-Partitioning-chapter-to-reflect-new-devel.patch text/x-diff 24.7 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2016-08-10 11:10:11 Re: Proposal for CSN based snapshots
Previous Message Amit Langote 2016-08-10 10:29:11 Re: Declarative partitioning