Re: Creating partitions automatically at least on HASH?

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Robert Eckhardt <reckhardt(at)pivotal(dot)io>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Creating partitions automatically at least on HASH?
Date: 2019-08-26 17:46:04
Message-ID: alpine.DEB.2.21.1908261935350.9896@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Rafia,

>> CREATE TABLE Stuff (...)
>> PARTITION BY [HASH | RANGE | LIST] (…)
>> DO NONE -- this is the default
>> DO [IMMEDIATE|DEFERRED] USING (…)
>>
>> Where the USING part would be generic keword value pairs, eg:
>>
>> For HASH: (MODULUS 8) and/or (NPARTS 10)
>>
>> For RANGE: (START '1970-01-01', STOP '2020-01-01', INCREMENT '1 year')
>> and/or (START 1970, STOP 2020, NPARTS 50)
>>
>> And possibly for LIST: (IN (…), IN (…), …), or possibly some other
>> keyword.
>>
>> The "DEFERRED" could be used as an open syntax for dynamic partitioning,
>> if later someone would feel like doing it.
>>
> ISTM that "USING" is better than "WITH" because WITH is already used
>> specifically for HASH and other optional stuff in CREATE TABLE.
>>
>> The text constant would be interpreted depending on the partitioning
>> expression/column type.
>>
>> Any opinion about the overall approach?

> I happen to start a similar discussion [1] being unaware of this one
> and there Ashutosh Sharma talked about interval partitioning in Oracle.
> Looking
> closely it looks like we can have this automatic partitioning more
> convenient by having something similar. Basically, it is creating
> partitions on demand or lazy partitioning.

Yep, the "what" of dynamic partitioning is more or less straightforward,
along the line you are describing.

For me there are really two questions:

- having a extendable syntax, hence the mail I sent, which would cover
both automatic static & dynamic partitioning and their parameters,
given that we already have manual static, automatic static should
be pretty easy.

- implementing the stuff, with limited performance impact if possible
for the dynamic case, which is non trivial.

> To explain a bit more, let's take range partition for example, first
> parent table is created and it's interval and start and end values are
> specified and it creates only the parent table just like it works today.

> Now, if there comes a insertion that does not belong to the existing (or
> any, in the case of first insertion) partition(s), then the
> corresponding partition is created,

Yep. Now, you also have to deal with race conditions issues, i.e. two
parallel session inserting tuples that must create the same partition, and
probably you would like to avoid a deadlock.

> I think it is extensible to other partitioning schemes as well. Also it
> is likely to have a positive impact on the queries, because there will
> be required partitions only and would not require to educate
> planner/executor about many empty partitions.

Yep, but it creates other problems to solve…

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-08-26 17:46:41 Re: Proposal: Better generation of values in GENERATED columns.
Previous Message Laurenz Albe 2019-08-26 17:43:45 Re: Procedure support improvements