Re: Declarative partitioning grammar

From: Mike <ipso(at)snappymail(dot)ca>
To: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Declarative partitioning grammar
Date: 2008-01-12 00:03:48
Message-ID: 1200096228.31759.19.camel@ipso.snappymail.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Pardon my ignorance as I've never actually used partitioning before but
plan to in the near future, but couldn't the grammar resemble a common
WHERE clause more closely?

> ... PARTITION BY RANGE(order_date)
> (
> START (date '2005-12-01') end (date '2007-12-01')
> EVERY(interval '2 months')
> );
>

PARTITION BY RANGE(order_date) ( WHERE order_date >= '2005-12-01' AND order_date < '2007-12-01' EVERY interval '2 months' )

OR

PARTITION BY RANGE(order_date) ( WHERE order_date BETWEEN '2005-12-01' AND '2007-12-01' )

Of course using '>','>=','<','<=' instead of start/end eliminates
any ambiguity along with the need for INCLUSIVE/EXCLUSIVE.

> ... PARTITION BY LIST (state, deptno)
> (
> VALUES ('OR', 1, 'WA', 1),
> VALUES ('AZ', 1, 'UT', 1, 'NM', 1),
> VALUES ('OR', 2, 'WA', 2),
> VALUES ('AZ', 2, 'UT', 2, 'NM', 2),
> PARTITION region_null VALUES (NULL, NULL),
> PARTITION region_other
> );

PARTITION BY LIST (state,deptno) (
PARTITION one WHERE state in ('OR', WA') AND deptno = 1
PARTITION two WHERE state in ('AZ', UT') AND deptno IN (1,2)
PARTITION region_null WHERE state is null OR deptno is NULL
PARTITION region_other
);

Do you even need to list the columns in the PARTITION BY part?

PARTITION BY LIST (
PARTITION one WHERE state in ('OR', WA') AND deptno = 1
PARTITION two WHERE state in ('AZ', UT') AND deptno IN (1,2)
PARTITION region_null WHERE state is null OR deptno is NULL
PARTITION region_other
);

Is there really a reason to not have a named partition as well? Sure it
saves a few keystrokes, but it makes trying to do anything with them at
a later date that much more difficult.

Your originally suggested grammar might be shorter to type, but using
WHERE clause syntax we are all familiar with seems a lot more intuitive
to me on the surface at least. Why not try to reuse grammar that already
exists as much as possible?

On Sat, 2008-01-12 at 00:19 +0100, Gavin Sherry wrote:

> CREATE TABLE is modified to accept a PARTITION BY clause. This clause
> contains one or more partition declarations. The syntax is as follows:
>
> PARTITION BY {partition_type} (column_name[, column_name...])
> [PARTITIONS number]
> (
> partition_declaration[, partition_declaration...]
>
> )

> List
> ----
>
> ... PARTITION BY LIST (state)
> (PARTITION q1_northwest VALUES ('OR', 'WA'),
> PARTITION q1_southwest VALUES ('AZ', 'UT', 'NM'),
> PARTITION q1_northeast VALUES ('NY', 'VM', 'NJ'),
> PARTITION q1_southeast VALUES ('FL', 'GA'),
> PARTITION q1_northcentral VALUES ('SD', 'WI'),
> PARTITION q1_southcentral VALUES ('OK', 'TX'));
>
> Here, we produce 6 different partitions. The first partition groups
> states in the North West of the USA. We introduce here the named
> partition concept for clarity.
>
> Range
> -----
>
> Range has the most expressive grammar. I'll introduce it in steps:
>
> ... PARTITION BY RANGE (b)
> (
> PARTITION aa start (date '2007-01-01') end (date '2008-01-01'),
> PARTITION bb start (date '2008-01-01') end (date '2009-01-01')
> );
>
> Here, we create 2 partitions: aa and bb. Partition aa has the range
> 2007-01-01 to 2008-01-01; partition bb has the range 2008-01-01 to
> 2009-01-01. Intervals always have this problem: are the bounds included
> in the range? To deal with this we define: the start of the range is
> included in the range. The ending bound is not. This can be modified
> with the keywords INCLUSIVE and EXCLUSIVE, which modify this property on
> a rule by rule basis.
>
> It is common that these partitions follow a pattern, such as following
> every week, month or year. So, we support the following specification:
>
> ... PARTITION BY RANGE(order_date)
> (
> START (date '2005-12-01') end (date '2007-12-01')
> EVERY(interval '2 months')
> );
>
> If we like, we can mix the specification a little:
>
> ... PARTITION BY RANGE(order_date)
> ( PARTITION Q1_2005 end (date '2005-04-01'),
> PARTITION Q2_2005 end (date '2005-07-01'),
> PARTITION Q3_2005 end (date '2005-10-10'),
> PARTITION Q4_2005 end (date '2006-01-01'),
> START (date '2006-02-01') end (date '2008-04-01')
> EVERY (interval '2 weeks')
> );
>
> an interesting result of the flexibility of the grammar we've come up
> with is that you can do something like this:
>
> ... PARTITION BY RANGE(order_date)
> ( PARTITION minny end date '2004-12-01'),
> end (date '2006-12-01'),
> PARTITION maxny start (date '2006-12-01')
> );
>
> Here, when order_date is less than 2004-12-01, we put the data in minny,
> when it is between 2004-12-01 and 2006-12-01 we put it in an unnamed
> partition and after this we put it in maxny.
>
> Tablespaces
> -----------
>
> We allow inline tablespace specification, such as:
>
> ... PARTITION BY RANGE(order_date)
> (
> PARTITION minny TABLESPACE compress,
> start (date '2004-12-01') end (date '2006-12-01') TABLESPACE hot,
> PARTITION maxny TABLESPACE compress
> );
>
> I've used the term compress here intentionally. A number of operating
> systems now ship file systems which can compress partitions. Users with
> issues with the amount of data they want to keep online can delay the
> time until they need new storage to a future date by compressing less
> regularly used data with this technique, for a performance cost. Data
> being used heavily can live on an uncompressed file system, affected.
>
> Multi-column support
> --------------------
>
> We can have multi-column partitions.
>
> ... PARTITION BY LIST (state, deptno)
> (
> VALUES ('OR', 1, 'WA', 1),
> VALUES ('AZ', 1, 'UT', 1, 'NM', 1),
> VALUES ('OR', 2, 'WA', 2),
> VALUES ('AZ', 2, 'UT', 2, 'NM', 2),
> PARTITION region_null VALUES (NULL, NULL),
> PARTITION region_other
> );
>
> Looking at this syntax now, I think I prefer:
>
> VALUES ('OR', 1),('WA', 1)
>
> To specify keys for the same partition. Thoughts?
>
> Composite partition support
> ---------------------------
>
> Given that we're talking about systems with potentially very large
> amounts of data, power users may want to combine range partitioning with
> hash or list partitioning. For example, your analysis might always be on
> a date range but also be broken down by sales office. So, this would
> combine range and list partitioning (if the sales offices were known) or
> hash partitioning (if they weren't known).
>
> To do this, we introduce the SUBPARTITION clause:
>
> ... PARTITION BY RANGE(order_date) SUBPARTITION BY HASH (office_id)
> SUBPARTITIONS 8
> (
> start (date '2005-12-01') end (date '2007-12-01')
> every (interval '3 months'),
> start (date '2007-12-01')
> end (date '2008-12-01') every (interval '1 month')
> );
>
> The first partition specification covers 8 partitions, the second 12 for
> 20 partitions in total. Once we add the subpartitioning we have 160
> partitions in total (20 * 8).
>
> Subpartitioning by list can look like this (see templates below):
>
> ... PARTITION BY RANGE(order_date) SUBPARTITION BY LIST (customer_id)
> (
> partition minny (subpartition c1 values (1), subpartition
> c2 values (2)),
> start (date '2004-12-01') end (date '2006-12-01')
> (subpartition c1 values (1), subpartition c2 values (2)),
> partition maxy (values (1), values (2)
> )
>
> So, the list parameters of each sub partition look like arguments to the
> primary partition. Again, see templates below if you think this looks
> cumbersome.
>
> We do not preclude subpartitions of subpartitions. So, the following is
> valid:
>
> ... PARTITION BY HASH(b)
> PARTITIONS 2
> SUBPARTITION BY HASH(d)
> SUBPARTITIONS 2,
> SUBPARTITION BY HASH(e) SUBPARTITIONS 2,
> SUBPARTITION BY HASH(f) SUBPARTITIONS 2,
> SUBPARTITION BY HASH(g) SUBPARTITIONS 2,
> SUBPARTITION BY HASH(h) SUBPARTITIONS 2;
>
> Subpartition templates
> ----------------------
>
> There are times we want subpartitions to be laid out in a specific way
> for all partitions. To do this, we use templates:
>
> ... PARTITION BY RANGE (order_date)
> SUBPARTITION BY LIST (state)
> SUBPARTITION TEMPLATE
> (
> SUBPARTITION northwest VALUES ('OR', 'WA'),
> SUBPARTITION southwest VALUES ('AZ', 'UT', 'NM'),
> SUBPARTITION northeast VALUES ('NY', 'VM', 'NJ'),
> SUBPARTITION southeast VALUES ('FL', 'GA'),
> SUBPARTITION northcentral VALUES ('SD', 'WI'),
> SUBPARTITION southcentral VALUES ('OK', 'TX')
> )
> (start (date '2001-01-01') end (date '2010-01-01')
> every (interval '3 months')
> )
>
> For each of the 36 odd partitions we create here, each is subpartitioned
> into geographical areas.
>
> Data management with ALTER
> --------------------------
>
> These are all arguments to ALTER TABLE. All of these require validation
> against the existing specification.
>
> ADD
> ---
>
> For range and list partitioning, it's important to be able to add
> partitions for data not covered by the existing specification. So, we
> propose:
>
> ... ADD PARTITION q1_2008 end (date '2008-04-01')
>
> COALESCE (maybe)
> ----------------
>
> For hash partitions, remove a partition from the number of hash
> partitions and distribute its data to the remaining partitions.
>
> ... COALESCE PARTITION [name];
>
> I'm not sure if this is really used but other systems we looked at have
> it. Thoughts?
>
> DROP
> ----
>
> For list and range partitions, drop a specified partition from the set
> of partitions.
>
> ... DROP PARTITION minny;
>
> This drops a named partition. Often, it will be difficult for users to
> know partition names, and they might be unnamed. So, we allow this
> syntax:
>
> ... DROP PARTITION FOR(date '2007-01-01');
>
> for range partitions; and:
>
> ... DROP PARTITION FOR(VALUES('CA'));
>
> for list partitions.
>
> We've also discussed something like:
>
> ... DROP PARTITION FOR(POSITION(1));
>
> so that users can easily drop a specific partition in an array of range
> partitions. It seems to me, though, that the use case is generally to
> drop the oldest partition so perhaps we should have a more explicit
> syntax. Thoughts?
>
> EXCHANGE
> --------
>
> This sub-clause allows us to make a table a partition in a set of
> partitions or take a partition out of a set but keep it as a table. IBM
> uses ATTACH and DETACH, which is explicit but Oracle uses EXCHANGE. I'll
> explain the latter:
>
> ... EXCHANGE <partition identifier> WITH TABLE <table name>
>
> partition identifier is one of PARTITION <name> or PARTITION FOR(...).
> The partition in the partition set 'becomes' the table <table name> and
> vice-versa. Essentially, we'd swap the relfilenodes. This means that we
> have to first ADD PARTITION then swap the table and the partition.
> Thoughts?
>
> MERGE
> -----
>
> You can merge and list partitions and any two range partitions:
>
> ... MERGE <partition id>, <partition id> [INTO PARTITION <partition name>]
>
> For range partitions:
>
> ... MERGE PARTITION FOR(date '2006_01_01'), PARTITION FOR(date '2007-01-01');
>
> For list partitions:
>
> ... MERGE PARTITION FOR(VALUES('CA', 'MA')
>
> This begs the question of why we have COALESCE for hash partitioning. I
> don't know, it just seems like the right thing since you can't merge two
> hash partitions together (well, you shouldn't want to).
>
> RENAME
> ------
>
> Rename a partition. We can use partition name or FOR clause.
>
> SPLIT
> -----
>
> Split is used to divide a partition in two. It is designed for list and
> range partitioning but I guess we could/should support hash. I need to
> think about that. For RANGE partitions:
>
> ... SPLIT <partition id> <AT-clause> [INTO (PARTITION <partition name1>,
> PARTITION <partition name2>)];
>
> AT clause specifies the point at which the partition is split in two:
>
> ... SPLIT PARTITION FOR(2000) AT 1000 INTO PARTITION (part1000,
> part2000)
>
> We might want ways to do this with unnamed partitions, it seems to me.
> Thoughts?
>
> For list:
>
> ... SPLIT PARTITION region_east AT( VALUES ('CT', 'MA', 'MD') )
> INTO
> (
> PARTITION region_east_1,
> PARTITION region_east_2
> );
>
> In this case, values from region_east specified in the AT() list are put in
> region_east_1 and the rest are put in region_east_2.
>
> I think a better way for supporting split with hash is via ADD. I'm sure
> some people think that ugly so I'd like feedback.
>
> TRUNCATE
> --------
>
> Truncate a specified partition:
>
> ... TRUNCATE PARTITION FOR ('2005-01-01')
>
> We could specify a name too.
>
> This will use truncate internally.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
--
Mike <ipso(at)snappymail(dot)ca>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Warren Turkal 2008-01-12 00:16:22 Re: Declarative partitioning grammar
Previous Message Ron Mayer 2008-01-11 23:42:20 Re: Declarative partitioning grammar