Re: Creating partitions automatically at least on HASH?

From: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Robert Eckhardt <reckhardt(at)pivotal(dot)io>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Creating partitions automatically at least on HASH?
Date: 2019-08-26 09:31:28
Message-ID: CA+FpmFdA-SM2toaLd+BJaUnkVXz+5FzXcX2qtmgNTFs74m93Xw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 18 Aug 2019 at 11:33, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:

>
> Hello Robert & Robert,
>
> > - no partitions are created immediately (current case)
> > but will have to be created manually later
> >
> > - static partitions are created automatically, based on provided
> > parameters
> >
> > - dynamic partitions will be created later, when needed, based
> > on provided parameters again.
> >
> > Even if all that is not implemented immediately.
> >
> >> We need something that will let you specify just a modulus for hash
> >> partitions, a start, end, and interval for range partitions, and a list
> of
> >> bounds for list partitions. If we're willing to create a new keyword,
> we
> >> could make PARTITIONS a keyword. Then:
> >>
> >> PARTITION BY HASH (whatever) PARTITIONS 8
> >
> > I think that it should reuse already existing keywords, i.e. MODULUS
> should
> > appear somewhere.
> >
> > Maybe:
> >
> > ... PARTITION BY HASH (whatever)
> > [ CREATE [IMMEDIATE | DEFERRED] PARTITIONS (MODULUS 8) |
> > NOCREATE or maybe NO CREATE ];
>
> I have given a small go at the parser part of that.
>
> There are 3 types of partitions with 3 dedicated syntax structures to
> handle their associated parameters (WITH …, FROM … TO …, IN …). ISTM that
> it is a "looks good from far away" idea, but when trying to extend that it
> is starting to be a pain. If a 4th partition type is added, should it be
> yet another syntax? So I'm looking for an generic and extensible syntax
> that could accomodate all cases for automatic creation of partitions.
>
> Second problem, adding a "CREATE" after "PARTITION BY … (…)" create
> shift-reduce conflicts with potential other CREATE TABLE option
> specification syntax. Not sure which one, but anyway. So the current
> generic syntax I'm considering is using "DO" as a trigger to start the
> optional automatic partition creation stuff:
>
> CREATE TABLE Stuff (...)
> PARTITION BY [HASH | RANGE | LIST] (…)
> DO NONE -- this is the default
> DO [IMMEDIATE|DEFERRED] USING (…)
>
> Where the USING part would be generic keword value pairs, eg:
>
> For HASH: (MODULUS 8) and/or (NPARTS 10)
>
> For RANGE: (START '1970-01-01', STOP '2020-01-01', INCREMENT '1 year')
> and/or (START 1970, STOP 2020, NPARTS 50)
>
> And possibly for LIST: (IN (…), IN (…), …), or possibly some other
> keyword.
>
> The "DEFERRED" could be used as an open syntax for dynamic partitioning,
> if later someone would feel like doing it.
>
ISTM that "USING" is better than "WITH" because WITH is already used
> specifically for HASH and other optional stuff in CREATE TABLE.
>
> The text constant would be interpreted depending on the partitioning
> expression/column type.
>
> Any opinion about the overall approach?
>
>
> I happen to start a similar discussion [1] being unaware of this one and
there Ashutosh Sharma talked about interval partitioning in Oracle. Looking
closely it looks like we can have this automatic partitioning more
convenient by having something similar. Basically, it is creating
partitions on demand or lazy partitioning. To explain a bit more, let's
take range partition for example, first parent table is created and it's
interval and start and end values are specified and it creates only the
parent table just like it works today. Now, if there comes a insertion
that does not belong to the existing (or any, in the case of first
insertion) partition(s), then the corresponding partition is created, I
think it is extensible to other partitioning schemes as well. Also it is
likely to have a positive impact on the queries, because there will be
required partitions only and would not require to educate planner/executor
about many empty partitions.

[1]
https://www.postgresql.org/message-id/flat/20190820205005.GA25823%40alvherre.pgsql#c67245b98e2cfc9c3bd261f134d05368

--
Regards,
Rafia Sabih

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2019-08-26 09:32:48 A problem about partitionwise join
Previous Message movead.li@highgo.ca 2019-08-26 09:10:59 Re: Re: Email to hackers for test coverage