Re: Declarative partitioning

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Amit Langote <amitlangote09(at)gmail(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning
Date: 2016-04-15 09:46:53
Message-ID: CAFjFpRdLbhUD7N0GY+DcAez_yPetnVZ_rtwfQL1KoiZVpZ_06g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

With the new set of patches, I am getting following warnings but no
compilation failures. Patches apply smoothly.
partition.c:1216:21: warning: variable ‘form’ set but not used
[-Wunused-but-set-variable]
partition.c:1637:20: warning: variable ‘form’ set but not used
[-Wunused-but-set-variable]

For creating partition the documentation seems to suggest the syntax
create table t1_p1 partition of t1 for values {start {0} end {100}
exclusive;
but following syntax works
create table t1_p1 partition of t1 for values start (0) end (100) exclusive;

> > 5. In expand_partitioned_rtentry(), instead of finding all the leaf level
> > relations, it might be better to build the RTEs, RelOptInfo and paths for
> > intermediate relations as well. This helps in partition pruning. In your
> > introductory write-up you have mentioned this. It might be better if v1
> > includes this change, so that the partition hierarchy is visible in
> EXPLAIN
> > output as well.
> >
> > 6. Explain output of scan on a partitioned table shows Append node with
> > individual table scans as sub-plans. May be we should annotate Append
> node
> > with
> > the name of the partitioned table to make EXPLAIN output more readable.
>
> I got rid of all the optimizer changes in the new version (except a line
> or two). I did that by switching to storing parent-child relationships in
> pg_inherits so that all the existing optimizer code for inheritance sets
> works unchanged. Also, the implementation detail that required to put
> only leaf partitions in the append list is also gone. Previously no
> storage was allocated for partitioned tables (either root or any of the
> internal partitions), so it was being done that way.
>

With this set of patches we are still flattening all the partitions.
postgres=# create table t1 (a int, b int) partition by range(a);
CREATE TABLE
postgres=# create table t1_p1 partition of t1 for values start (0) end
(100) exclusive partition by range(b);
CREATE TABLE
postgres=# create table t1_p1_p1 partition of t1_p1 for values start (0)
end (100) exclusive;
CREATE TABLE
postgres=# explain verbose select * from t1;
QUERY PLAN
-------------------------------------------------------------------------
Append (cost=0.00..32.60 rows=2262 width=8)
-> Seq Scan on public.t1 (cost=0.00..0.00 rows=1 width=8)
Output: t1.a, t1.b
-> Seq Scan on public.t1_p1 (cost=0.00..0.00 rows=1 width=8)
Output: t1_p1.a, t1_p1.b
-> Seq Scan on public.t1_p1_p1 (cost=0.00..32.60 rows=2260 width=8)
Output: t1_p1_p1.a, t1_p1_p1.b
(7 rows)
Retaining the partition hierarchy would help to push-down join across
partition hierarchy effectively.

>
> Regarding 6, it seems to me that because Append does not have a associated
> relid (like scan nodes have with scanrelid). Maybe we need to either fix
> Append or create some enhanced version of Append which would also support
> dynamic pruning.
>

Right, I think, Append might store the relid of the parent table as well as
partitioning information at that level along-with the subplans.

Some more comments:
1. Would it be better to declare PartitionDescData as
{
int nparts;
PartitionInfo *partinfo; /* array of partition information structures. */
}

2. The new syntax allows CREATE TABLE to be specified as partition of an
already partitioned table. Is it possible to do the same for CREATE FOREIGN
TABLE? Or that's material for v2? Similarly for ATTACH PARTITION.

3. PartitionKeyData contains KeyTypeCollInfo, whose contents can be
obtained by calling functions exprType, exprTypemod on partexprs. Why do we
need to store that information as a separate member?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2016-04-15 09:48:43 Re: Declarative partitioning
Previous Message Masahiko Sawada 2016-04-15 08:36:57 Re: Support for N synchronous standby servers - take 2