Re: Declarative partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Amit Langote <amitlangote09(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning
Date: 2016-04-18 05:02:36
Message-ID: 57146A6C.2010801@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hi Ashutosh,

On 2016/04/15 18:48, Ashutosh Bapat wrote:
> With the new set of patches, I am getting following warnings but no
> compilation failures. Patches apply smoothly.
> partition.c:1216:21: warning: variable ‘form’ set but not used
> [-Wunused-but-set-variable]
> partition.c:1637:20: warning: variable ‘form’ set but not used
> [-Wunused-but-set-variable]

Ah, will fix.

> For creating partition the documentation seems to suggest the syntax
> create table t1_p1 partition of t1 for values {start {0} end {100}
> exclusive;
> but following syntax works
> create table t1_p1 partition of t1 for values start (0) end (100) exclusive;

Hmm, I see the following in docs:

...
and partition_bound_spec is:

FOR VALUES { list_spec | range_spec }

...

list_spec in FOR VALUES is:

{ IN ( expression [, ...] ) }

range_spec in FOR VALUES is:

{ START (lower-bound) [ INCLUSIVE | EXCLUSIVE ] |
END (upper-bound) [ INCLUSIVE | EXCLUSIVE ] |
START (lower-bound) [ INCLUSIVE | EXCLUSIVE ] END (upper-bound) [
INCLUSIVE | EXCLUSIVE ] }

By the way, I see that you are always specifying "exclusive" for end bound
in your examples. The specification is optional if that wasn't apparent.
Default (ie, without explicit specification) is [start, end).

>> I got rid of all the optimizer changes in the new version (except a line
>> or two). I did that by switching to storing parent-child relationships in
>> pg_inherits so that all the existing optimizer code for inheritance sets
>> works unchanged. Also, the implementation detail that required to put
>> only leaf partitions in the append list is also gone. Previously no
>> storage was allocated for partitioned tables (either root or any of the
>> internal partitions), so it was being done that way.
>>
>
> With this set of patches we are still flattening all the partitions.
> postgres=# create table t1 (a int, b int) partition by range(a);
> CREATE TABLE
> postgres=# create table t1_p1 partition of t1 for values start (0) end
> (100) exclusive partition by range(b);
> CREATE TABLE
> postgres=# create table t1_p1_p1 partition of t1_p1 for values start (0)
> end (100) exclusive;
> CREATE TABLE
> postgres=# explain verbose select * from t1;
> QUERY PLAN
> -------------------------------------------------------------------------
> Append (cost=0.00..32.60 rows=2262 width=8)
> -> Seq Scan on public.t1 (cost=0.00..0.00 rows=1 width=8)
> Output: t1.a, t1.b
> -> Seq Scan on public.t1_p1 (cost=0.00..0.00 rows=1 width=8)
> Output: t1_p1.a, t1_p1.b
> -> Seq Scan on public.t1_p1_p1 (cost=0.00..32.60 rows=2260 width=8)
> Output: t1_p1_p1.a, t1_p1_p1.b
> (7 rows)
> Retaining the partition hierarchy would help to push-down join across
> partition hierarchy effectively.

Yes, it's still a flattened append (basically what optimizer currently
creates for arbitrary inheritance trees). I didn't modify any of that yet.

>> Regarding 6, it seems to me that because Append does not have a associated
>> relid (like scan nodes have with scanrelid). Maybe we need to either fix
>> Append or create some enhanced version of Append which would also support
>> dynamic pruning.
>>
>
> Right, I think, Append might store the relid of the parent table as well as
> partitioning information at that level along-with the subplans.

For time being, I will leave this as yet unaddressed (I am thinking about
what is reasonable to do for this also considering Robert's comment). Is
that OK?

> Some more comments
> Would it be better to declare PartitionDescData as
> {
> int nparts;
> PartitionInfo *partinfo; /* array of partition information structures. */
> }

I think that might be better. Will do it the way you suggest.

> The new syntax allows CREATE TABLE to be specified as partition of an
> already partitioned table. Is it possible to do the same for CREATE FOREIGN
> TABLE? Or that's material for v2? Similarly for ATTACH PARTITION.

OK, I will address this in the next version. One question though: should
foreign table be only allowed to be leaf partitions (ie, no PARTITION BY
clause in CREATE FOREIGN TABLE ... PARTITION OF)? By the way, ATTACH
PARTITION only allows leaf partitions anyway.

Thanks a lot for the review!

Thanks,
Amt

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2016-04-18 05:15:21 Re: Support for N synchronous standby servers - take 2
Previous Message Craig Ringer 2016-04-18 04:52:17 Re: Confusing comment in pg_upgrade in regards to VACUUM FREEZE