Re: Declarative partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Declarative partitioning
Date: 2016-01-22 04:41:54
Message-ID: 56A1B312.4000608@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hi,

On 2016/01/17 9:47, Corey Huinker wrote:
>> If we have a CREATE statement for each partition, how do we generalize
>> that to partitions at different levels? For example, if we use something
>> like the following to create a partition of parent_name:
>>
>> CREATE PARTITION partition_name OF parent_name FOR VALUES ...
>> WITH ... TABLESPACE ...
>>
>> Do we then say:
>>
>> CREATE PARTITION subpartition_name OF partition_name ...
>
> That's how I'd want it for partitions created after the initial partitioned
> table is created.
>
> I'd like to be able to identify the parent partition by it's own
> partitioning parameters rather than name, like the way we can derive the
> name of an index in ON CONFLICT. But I see no clean way to do that, and if
> one did come up, we'd simply allow the user to replace
> <partition_name>
> with
> table_name PARTITION partition_spec [...PARTITION partition_spec [
> ...PARTITION turtles_all_the_way_down]]).
>
> Again, totally fine with forcing the maintenance script to know or discover
> the name of the partition to be subpartitioned...for now.

I'm thinking of going with this last option.

So for now, you create an empty partitioned table specifying all the
partition keys without being able to define any partitions in the same
statement. Partitions (and partitions thereof, if any) will be created
using CREATE PARTITION statements, one for each.

>> to create a level 2 partition (sub-partition) of parent_name? Obviously,
>> as is readily apparent from the command, it is still a direct partition of
>> partition_name for all internal purposes (consider partition list caching
>> in relcache, recursive tuple routing, etc.) save some others.
>>
>> I ask that also because it's related to the choice of syntax to use to
>> declare the partition key for the multi-level case. I'm considering the
>> SUBPARTITION BY notation and perhaps we could generalize it to more than
>> just 2 levels. So, for the above case, parent_name would have been created
>> as:
>>
>> CREATE TABLE parent_name PARTITION BY ... SUBPARTITION BY ...
>
>> Needless to say, when subpartition_name is created with the command we saw
>> a moment ago, the root partitioned table would be locked. In fact, adding
>> a partition anywhere in the hierarchy needs an exclusive lock on the root
>> table. Also, partition rule (the FOR VALUES clause) would be validated
>> against PARTITION BY or SUBPARTITION BY clause at the respective level.
>>
>> Although, I must admit I feel a little uneasy about the inherent asymmetry
>> in using SUBPARTITION BY for key declaration whereas piggybacking CREATE
>> PARTITION for creating sub-partitions. Is there a better way?
>
> Provided that the syntax allows for N levels of partitioning, I don't care
> if it's
> PARTITION BY.., PARTITION BY..., PARTITION BY ...
> or
> PARTITION BY.., SUBPARTITION BY..., SUBPARTITION BY ...
>
> The first is probably better for meta-coding purposes, but the second makes
> it clear which partition layer is first.

I prefer second one too, for clarity. In any case, the PARTITION /
SUBPARTITION BY clauses are ordered. That is, successive clauses specify
partition key for corresponding successive levels of the hierarchy.

> There's a great chance that not everyone cares right now about this part
>> of the new partitioning but just want to put it out there. There are more
>> contentious issues like the syntax, partitioning maintenance commands that
>> we plan to support (now or later) and such.
>>
>
> What I've read so far addresses most of my concerns.
>
> Still somewhat on my mind:
>
> 1. ability to describe partition bounds via range types, regardless of
> whether the Automatic Tuple Routing uses those types internally.

Specifying range partitioning bound as PARTITION FOR RANGE <range-literal>
sounds like it offers some flexibility, which can be seen as a good thing.
But it tends to make internal logic slightly complicated.

Whereas, saying PARTITION FOR VALUES LESS THAN (max1, max2, ...) is
notationally simpler still and easier to work with internally. Also, there
will be no confusion about exclusivity of the bound if we document it so.

> 2. syntax for splitting a partition in two, merging two adjacent partitions
> (you probably touched on these earlier and I missed it or forgot).

Maintenance operations like split, merge would not be the first cut.

> 3. ability to swap a partition with a table not currently associated with
> the partitioned table.

EXCHANGE PARTITION-like ability is something on my list. There are a few
hurdles to get there. Especially, source table would have to be physical
tuple descriptor (ie, including dropped columns) compatible with the
partitioned table given some other parts of the design (I touched on those
in my earlier message).

> 4. The applicability of this syntax to materialized views, allowing us to
> do REFRESH CONCURRENTLY a few parts at a time, or only refreshing the data
> we know needs it.
>
> Items 2 and 3 don't have to be implemented right away, as they're separate
> ALTER commands. 4 is a pipe dream. With Item 1 I ask only that we don't
> pick a syntax that prevents description via range types.

Thanks a lot for taking time to comment.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2016-01-22 04:56:26 Re: Releasing in September
Previous Message Haribabu Kommi 2016-01-22 04:25:26 Re: Parallel Aggregate