Re: Declarative partitioning

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Declarative partitioning
Date: 2016-01-17 00:47:21
Message-ID: CADkLM=fazD-GFZKmERQ1pObi8BiQb99+zaEj8BDW02FyptdWYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
>
> If we have a CREATE statement for each partition, how do we generalize
> that to partitions at different levels? For example, if we use something
> like the following to create a partition of parent_name:
>
> CREATE PARTITION partition_name OF parent_name FOR VALUES ...
> WITH ... TABLESPACE ...
>
> Do we then say:
>
> CREATE PARTITION subpartition_name OF partition_name ...
>

That's how I'd want it for partitions created after the initial partitioned
table is created.

I'd like to be able to identify the parent partition by it's own
partitioning parameters rather than name, like the way we can derive the
name of an index in ON CONFLICT. But I see no clean way to do that, and if
one did come up, we'd simply allow the user to replace
<partition_name>
with
table_name PARTITION partition_spec [...PARTITION partition_spec [
...PARTITION turtles_all_the_way_down]]).

Again, totally fine with forcing the maintenance script to know or discover
the name of the partition to be subpartitioned...for now.

> to create a level 2 partition (sub-partition) of parent_name? Obviously,
> as is readily apparent from the command, it is still a direct partition of
> partition_name for all internal purposes (consider partition list caching
> in relcache, recursive tuple routing, etc.) save some others.
>
> I ask that also because it's related to the choice of syntax to use to
> declare the partition key for the multi-level case. I'm considering the
> SUBPARTITION BY notation and perhaps we could generalize it to more than
> just 2 levels. So, for the above case, parent_name would have been created
> as:
>
> CREATE TABLE parent_name PARTITION BY ... SUBPARTITION BY ...

> Needless to say, when subpartition_name is created with the command we saw
> a moment ago, the root partitioned table would be locked. In fact, adding
> a partition anywhere in the hierarchy needs an exclusive lock on the root
> table. Also, partition rule (the FOR VALUES clause) would be validated
> against PARTITION BY or SUBPARTITION BY clause at the respective level.
>
> Although, I must admit I feel a little uneasy about the inherent asymmetry
> in using SUBPARTITION BY for key declaration whereas piggybacking CREATE
> PARTITION for creating sub-partitions. Is there a better way?
>

Provided that the syntax allows for N levels of partitioning, I don't care
if it's
PARTITION BY.., PARTITION BY..., PARTITION BY ...
or
PARTITION BY.., SUBPARTITION BY..., SUBPARTITION BY ...

The first is probably better for meta-coding purposes, but the second makes
it clear which partition layer is first.

>
> > As for the convenience syntax (if at all), how about:
> >
> > CREATE TABLE foo (
> > ...
> > )
> > PARTITION BY ... ON (...)
> > SUBPARTITION BY ... ON (...)
> > opt_partition_list;
> >
> > where opt_partition_list is:
> >
> > PARTITIONS (
> > partname FOR VALUES ... [WITH] [TABLESPACE] opt_subpart_list
> > [, ...]
> > )
> >
> > where opt_subpart_list is:
> >
> > SUBPARTITIONS (
> > subpartname FOR VALUES ... [WITH] [ TABLESPACE]
> > [, ...]
> > )
>
> Do we want this at all? It seems difficult to generalize this to
> multi-level hierarchy of more than 2 levels.
>

I want this.

Granted the syntax of a 3+ level partitioning would be cumbersome, but it
is what the user wanted, and the nested PARTITION/SUBPARTITION. In those
cases, the user might opt to not create more than the default first
subpartition to keep the syntax sane, or we might auto-generate default
partitions (with a VALUES clause of whatever "all values" is for that
datatype...again, this is an area where leveraging range types would be
most valuable).

> On one hand, I think to keep treating "partition hierarchies" as
> "inheritance hierachies" might have some issues. I am afraid that
> documented inheritance semantics may not be what we want to keep using for
> the new partitioned tables. By that, I mean all the user-facing behaviors
> where inheritance has some bearing. Should it also affect new partitioned
> tables? Consider whether inheritance semantics would render infeasible
> some of the things that we'd like to introduce for the new partitioned
> tables such as automatic tuple routing, or keep us from improving planner
> smarts and executor capabilities for partitioned tables over what we
> already have.
>

I feel that Automatic tuple routing should be considered they key benefit
of "real" partitions over inherited tables. Trigger maintenance is most of
the work of custom partitioning schemes, at least the ones I've written.

There's a great chance that not everyone cares right now about this part
> of the new partitioning but just want to put it out there. There are more
> contentious issues like the syntax, partitioning maintenance commands that
> we plan to support (now or later) and such.
>

What I've read so far addresses most of my concerns.

Still somewhat on my mind:

1. ability to describe partition bounds via range types, regardless of
whether the Automatic Tuple Routing uses those types internally.
2. syntax for splitting a partition in two, merging two adjacent partitions
(you probably touched on these earlier and I missed it or forgot).
3. ability to swap a partition with a table not currently associated with
the partitioned table.
4. The applicability of this syntax to materialized views, allowing us to
do REFRESH CONCURRENTLY a few parts at a time, or only refreshing the data
we know needs it.

Items 2 and 3 don't have to be implemented right away, as they're separate
ALTER commands. 4 is a pipe dream. With Item 1 I ask only that we don't
pick a syntax that prevents description via range types.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2016-01-17 02:09:26 Re: Some bugs in psql_complete of psql
Previous Message Andres Freund 2016-01-16 23:48:46 Re: exposing pg_controldata and pg_config as functions