Re: Declarative partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning
Date: 2016-02-24 08:11:04
Message-ID: 56CD6598.4090302@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016/02/20 5:06, Corey Huinker wrote:
> On Thu, Feb 18, 2016 at 12:41 AM, Amit Langote wrote:
>
>> START [ EXCL ] (startval) END [ INCL ] (endval)
>>
>> That is, in range type notation, '[startval, endval)' is the default
>> behavior. So for each partition, there is at least the following pieces of
>> metadata:
>>
>
> This is really close, and if it is what we ended up with we would be able
> to use it.
>
> I suggest that the range notation can be used even when no suitable range
> type exists.
>
> I assume the code for parsing a range spec regardless of data type already
> exists, but in case it doesn't, take a range spec of unknown type:
>
> [x,y)
>
> x and y are either going to be raw strings or doublequoted strings with
> possible doublequote escapes, each of which would be coercible into the the
> type of the partition column.
>
> In other words, if your string values were 'blah , blah ' and 'fabizzle',
> the [) range spec would be ["blah , blah ",fabizzle).

Hm, I see. How about multi-column keys? Do we care enough about that use
case? I don't see a nice-enough-looking range literal for such keys.
Consider for instance,

With the partitioned table defined as:

CREATE TABLE foo(c1 char(2), c2 char(2)) PARTITION BY RANGE (c1, c2);

where we'd use LESS THAN as:

CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES LESS THAN ('b', '2');
CREATE TABLE foo_ax2x PARTITION OF foo FOR VALUES LESS THAN ('b', '3');
CREATE TABLE foo_ax3x PARTITION OF foo FOR VALUES LESS THAN ('b', '4');

CREATE TABLE foo_bx1x PARTITION OF foo FOR VALUES LESS THAN ('c', '2');
CREATE TABLE foo_bx2x PARTITION OF foo FOR VALUES LESS THAN ('c', '3');
CREATE TABLE foo_bx3x PARTITION OF foo FOR VALUES LESS THAN ('c', '4');

I guess it would be the following with the new range spec:

CREATE TABLE foo_ax1x PARTITION OF foo FOR VALUES ['b', 'b'], ['1', '2');
CREATE TABLE foo_ax2x PARTITION OF foo FOR VALUES ['b', 'b'], ['2', '3');
CREATE TABLE foo_ax3x PARTITION OF foo FOR VALUES ['b', 'b'], ['3', '4');

CREATE TABLE foo_bx1x PARTITION OF foo FOR VALUES ['c', 'c'], ['1', '2');
CREATE TABLE foo_bx2x PARTITION OF foo FOR VALUES ['c', 'c'], ['2', '3');
CREATE TABLE foo_bx3x PARTITION OF foo FOR VALUES ['c', 'c'], ['3', '4');

The first column does not unambiguously select a partition with such keys.
Now I can imagine that it is possible to "emulate" such a multi-column
range key with k columns by k levels of range-range-* partitioning:

CREATE TABLE foo(c1 char(2), c2 char(2)) PARTITION BY RANGE (c1);

CREATE TABLE foo_ax PARTITION OF foo
FOR VALUES ['b, 'c') PARTITION BY RANGE (c2);

CREATE TABLE foo_ax1x PARTITION OF foo_ax FOR VALUES ['1', '2');
CREATE TABLE foo_ax2x PARTITION OF foo_ax FOR VALUES ['2', '3');
CREATE TABLE foo_ax3x PARTITION OF foo_ax FOR VALUES ['3', '4');

CREATE TABLE foo_bx PARTITION OF foo
FOR VALUES ['b, 'c') PARTITION BY RANGE (c2);

CREATE TABLE foo_bx1x PARTITION OF foo_bx FOR VALUES ['1', '2');
CREATE TABLE foo_bx2x PARTITION OF foo_bx FOR VALUES ['2', '3');
CREATE TABLE foo_bx3x PARTITION OF foo_bx FOR VALUES ['3', '4');

But IIRC, I have been contradicted once before regarding whether two ways
are exactly the same.

> Using regular range specs syntax also allows for the range to be unbounded
> in either or both directions, which is a possibility, especially in newer
> tables where the expected distribution of data is unknown.

We would want to also think about what subset of many permutations of this
syntax to accept range specs for new partitions. Mostly to preserve the
non-overlapping invariant and I think it would also be nice to prevent gaps.

Consider that once we create:

PARTITION FOR VALUES [current_date,);

Now to create a new partition starting at later date, we have to have a
"split partition" feature which would scan the above partition to
distribute the existing data rows appropriately to the resulting two
partitions. Right?

IOW, one shouldn't create an unbounded partition if more partitions in the
unbounded direction are expected to be created. It would be OK for
unbounded partitions to be on the lower end most of the times.

> p.s. Sorry I haven't been able to kick the tires just yet. We have a very
> good use case for this, it's just a matter of getting a machine and the
> time to devote to it.

I would appreciate it. You could wait a little more for my next
submission which will contain some revisions to the tuple routing code.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2016-02-24 08:37:57 Re: Support for N synchronous standby servers - take 2
Previous Message Thomas Munro 2016-02-24 07:29:58 Re: postgres_fdw vs. force_parallel_mode on ppc