Re: Creating partitions automatically at least on HASH?

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Robert Haas <robertmhaas(at)gmail(dot)com>, Robert Eckhardt <reckhardt(at)pivotal(dot)io>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Creating partitions automatically at least on HASH?
Date: 2019-09-18 06:11:11
Message-ID: CA+HiwqH1j5LCvZjb+ak9jyi15q+D3sLEtdn25_syfOJKN80N_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Fabien, Rafia,

Thanks for starting this discussion.

On Tue, Aug 27, 2019 at 5:36 PM Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com> wrote:
> On Mon, 26 Aug 2019 at 19:46, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>> > I happen to start a similar discussion [1] being unaware of this one
>> > and there Ashutosh Sharma talked about interval partitioning in Oracle.
>> > Looking
>> > closely it looks like we can have this automatic partitioning more
>> > convenient by having something similar. Basically, it is creating
>> > partitions on demand or lazy partitioning.
>>
>> Yep, the "what" of dynamic partitioning is more or less straightforward,
>> along the line you are describing.
>>
>> For me there are really two questions:
>>
>> - having a extendable syntax, hence the mail I sent, which would cover
>> both automatic static & dynamic partitioning and their parameters,
>> given that we already have manual static, automatic static should
>> be pretty easy.
>>
>> - implementing the stuff, with limited performance impact if possible
>> for the dynamic case, which is non trivial.
>>
>> > To explain a bit more, let's take range partition for example, first
>> > parent table is created and it's interval and start and end values are
>> > specified and it creates only the parent table just like it works today.
>>
>> > Now, if there comes a insertion that does not belong to the existing (or
>> > any, in the case of first insertion) partition(s), then the
>> > corresponding partition is created,
>>
>> Yep. Now, you also have to deal with race conditions issues, i.e. two
>> parallel session inserting tuples that must create the same partition, and
>> probably you would like to avoid a deadlock.
>>
> Hmmm, that shouldn't be very hard. Postgres handles many such things and I think mostly by a mutex guarded shared memory structure. E.g. we can have a shared memory structure associated with the parent table holding the information of all the available partitions, and keep this structure guarded by mutex. Anytime a new partition has to be created the relevant information is first entered in this structure before actually creating it.

I like the Fabien's approach to focus on automatic creation of
partitions only "statically" at first, deferring any complex matters
of the "dynamic" counterpart to a later date. One advantage is that
we get to focus on the details of the UI for this feature, which has
complexities of its own. Speaking of which, how about the following
variant of the syntax that Fabien proposed earlier:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is:

PARTITIONS { IMMEDIATE | DEFERRED } USING (partition_descriptor)

where partition_descriptor is:

MODULUS integer | FROM (range_start) END (range_end) INTERVAL
(range_step) | list_values

where range_ start/end/step is:

(expr [,...])

and list_values is:

(expr [,...]) [, ....]

Note that list_values contains one parenthesized list per partition.
This is slightly different from what Robert suggested upthread in that
even a single value needs parentheses.

Automatic creation of multi-column range partitions seems a bit tricky
as thinking about a multi-column "interval" is tricky.

Needless to say, PARTITIONS DEFERRED will cause an unsupported feature
error in the first cut.

Thanks,
Amit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2019-09-18 06:26:35 Re: Efficient output for integer types
Previous Message David Fetter 2019-09-18 05:51:42 Re: Efficient output for integer types