Re: Creating partitions automatically at least on HASH?

From: Robert Eckhardt <reckhardt(at)pivotal(dot)io>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Creating partitions automatically at least on HASH?
Date: 2019-07-15 19:50:03
Message-ID: CAAtBm9VMK_8aRE3qjY4_a04ErH703Ni39O_kTcGafKuzUWARsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 15, 2019 at 10:54 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Mon, Jul 15, 2019 at 1:29 AM Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
> > Hello pgdevs,
> >
> > sorry if this has been already discussed, but G did not yield anything
> > convincing about that.
> >
> > While looking at HASH partitioning and creating a few ones, it occured to
> > me that while RANGE and LIST partitions cannot be guessed easily, it would
> > be easy to derive HASH partitioned table for a fixed MODULUS, e.g. with
> >
> > CREATE TABLE foo(...) PARTITION BY HASH AUTOMATIC (MODULUS 10);
> > -- or some other syntax
> >
> > Postgres could derive statically the 10 subtables, eg named foo_$0$ to
> > foo_$1$.
> >
> > That would not be a replacement for the feature where one may do something
> > funny and doubtful like (MODULUS 2 REMAINDER 0, MODULUS 4 REMAINDER 1,
> > MODULUS 4 REMAINDER 3).
> >
> > The same declarative approach could eventually be considered for RANGE
> > with a fixed partition duration and starting and ending points.
> >
> > This would be a relief on the longer path of dynamically creating
> > partitions, but with lower costs than a dynamic approach.
>
> Yeah, I think something like this would be reasonable, but I think
> that the best syntax is not really clear. We might want to look at
> how other systems handle this.

Greenplum has a syntax that covers some cases but not the hash case.

For range based partitions we have:

CREATE TABLE sales (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( START (date '2016-01-01') INCLUSIVE
END (date '2017-01-01') EXCLUSIVE
EVERY (INTERVAL '1 day') );

This is equivelant to the below so you can also declare and name each
partition individually. For example:

CREATE TABLE sales (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( PARTITION Jan16 START (date '2016-01-01') INCLUSIVE ,
PARTITION Feb16 START (date '2016-02-01') INCLUSIVE ,
PARTITION Mar16 START (date '2016-03-01') INCLUSIVE ,
PARTITION Apr16 START (date '2016-04-01') INCLUSIVE ,
PARTITION May16 START (date '2016-05-01') INCLUSIVE ,
PARTITION Jun16 START (date '2016-06-01') INCLUSIVE ,
PARTITION Jul16 START (date '2016-07-01') INCLUSIVE ,
PARTITION Aug16 START (date '2016-08-01') INCLUSIVE ,
PARTITION Sep16 START (date '2016-09-01') INCLUSIVE ,
PARTITION Oct16 START (date '2016-10-01') INCLUSIVE ,
PARTITION Nov16 START (date '2016-11-01') INCLUSIVE ,
PARTITION Dec16 START (date '2016-12-01') INCLUSIVE
END (date '2017-01-01') EXCLUSIVE );

You can do similar things with numeric

CREATE TABLE rank (id int, rank int, year int, gender
char(1), count int)
DISTRIBUTED BY (id)
PARTITION BY RANGE (year)
( START (2006) END (2016) EVERY (1),
DEFAULT PARTITION extra );

ENUM

CREATE TABLE rank (id int, rank int, year int, gender
char(1), count int )
DISTRIBUTED BY (id)
PARTITION BY LIST (gender)
( PARTITION girls VALUES ('F'),
PARTITION boys VALUES ('M'),
DEFAULT PARTITION other );

Also it supports multilevel partitioning using a PARTITION TEMPLATE
and SUBPARTITION TEMPLATE. The partitioning template ensures that the
structure at every level is the same.

CREATE TABLE p3_sales (id int, year int, month int, day int,
region text)
DISTRIBUTED BY (id)
PARTITION BY RANGE (year)
SUBPARTITION BY RANGE (month)
SUBPARTITION TEMPLATE (
START (1) END (13) EVERY (1),
DEFAULT SUBPARTITION other_months )
SUBPARTITION BY LIST (region)
SUBPARTITION TEMPLATE (
SUBPARTITION usa VALUES ('usa'),
SUBPARTITION europe VALUES ('europe'),
SUBPARTITION asia VALUES ('asia'),
DEFAULT SUBPARTITION other_regions )
( START (2002) END (2012) EVERY (1),
DEFAULT PARTITION outlying_years );

-- Rob

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-07-15 19:50:16 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Previous Message Tom Lane 2019-07-15 19:49:50 Re: POC: converting Lists into arrays