Re: Syntax for partitioning

From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Syntax for partitioning
Date: 2009-11-02 06:50:45
Message-ID: 20091102155044.C6B9.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are details of partitioning syntax.

-----------------
Syntax overview
-----------------
Partitions are defined with 3 steps:
1. Create a plain table as parent.
2. Set a partition key to a table.
3. Add a partition to a table which has a partition key.

i.e.,
CREATE TABLE table (...) PARTITION BY { RANGE | LIST } ( key ) (...)
is just an abbreviated form of:
1. CREATE TABLE table (...);
2. ALTER TABLE table PARTITION BY { RANGE | LIST } ( key );
3. CREATE PARTITION name ON table VALUES ...;

Currently RANGE and LIST partitions are supported.
No reserved keywords are required by the syntax, and that's why
ALTER TABLE ADD PARTITION cannot be used here instead of CREATE PARTITION.

HASH partitions are not supported, but we can use LIST paritions with
an expression key as incomplete HASH partitions:
CREATE TABLE table (...) PARTITION BY LIST ( hashtext(attr) ) (...);
SELECT * FROM table WHERE hashtext(attr) = hashtext('search_key');

------------------------------------------
Features *NOT* included in this proposal
------------------------------------------
To simplify patch, the following features are not included:
- Partition triggers to dispatch rows inserted into parent table
- Expanding some commands for a parent to partitions (ex. VACUUM)
- ALTER commands except RENAME (ex. MERGE, SPLIT, UPDATE)
- Ability to add an existing table to a parent as a partition
(ex. ALTER TABLE table INHERIT parent AS PARTITION)
- Planner and Executor improvements

I have plans to implement some of them in separated patches, but not now.

-----------------
Catalog changes
-----------------
A new system catalog "pg_partition" represents partition keys for each
table. A parent table of partitions has only one pg_partition row.
I think separated pg_partition table is better than adding these columns
to pg_class, but it might be debatable.

CREATE TABLE pg_partition (
partrelid oid REFERENCES oid ON pg_class, -- partitioned table oid
partopr oid REFERENCES oid ON pg_operator, -- operator to comapre keys
partkind "char", -- kind of partition: 'R' (range) or 'L' (list)
partkey text, -- expression tree of partition key
PRIMARY KEY (partrelid)
) WITHOUT OIDS;

In addition, we would need to store threshold values of child tables
somewhere, but under consideration. I'm thinking to extract upper
and lower bounds from CHECK constraint, but it might be unreliable.
Comments and ideas welcome.

----------------
Syntax details
----------------
CREATE TABLE table (...) PARTITION BY RANGE ( expr [USING operator] )
[ (
PARTITION name VALUES LESS THAN [(] upper [)],
PARTITION name VALUES LESS THAN [(] MAXVALUE [)] -- overflow partition
) ] ;
-- default operator is '<' for range partitions

CREATE TABLE table (...) PARTITION BY LIST ( expr [USING operator] )
[ (
PARTITION name VALUES [IN] ( values ),
PARTITION name VALUES [IN] [(] DEFAULT [)] -- overflow partition
) ] ;
-- default operator is '=' for list partitions

ALTER TABLE table PARTITION BY { RANGE | LIST } ... ;
ALTER TABLE table NO PARTITION; -- drop partition key

CREATE PARTITION partition ON table VALUES LESS THAN ...; -- range
CREATE PARTITION partition ON table VALUES IN (...); -- list

DROP PARTITION partition; -- synonym for DROP TABLE
ALTER PARTITION partition RENAME TO name; -- synonym for ALTER TABLE RENAME

Note:
* Each partition can have optional WITH (...) and TABLESPACE clauses.
* '(' and ')' are optional to support both Oracle and MySQL syntax.

-----------
WIP patch
-----------
The attached partitioning_20091102.patch is a WIP patch. There are
still not implemented features marked with TODO tags, but I'll use
this design -- especially Node manipulations.
Please notice me if I'm missing something.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
partitioning_20091102.patch application/octet-stream 68.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2009-11-02 06:51:29 backup_label in a crash recovery
Previous Message Toru SHIMOGAKI 2009-11-02 06:40:25 Remove backup_label automatically