Re: Declarative partitioning - another take

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning - another take
Date: 2016-09-02 05:38:51
Message-ID: CAFjFpRc9k5ctS3htaobXH4mrzN3ufXqJA2gupOt99Y5HNtUHnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's something I observed with your set of patches posted in June. I have
not checked the latest set of patches. So, if it's something fixed, please
ignore the mail and sorry for me being lazy.

prt1 is partitioned table and it shows following information with \d+

regression=# \d+ prt1
Partitioned table "public.prt1"
Column | Type | Modifiers | Storage | Stats target |
Description
--------+-------------------+-----------+----------+--------------+-------------
a | integer | | plain | |
b | integer | | plain | |
c | character varying | | extended | |
Partition Key: PARTITION BY RANGE (a)
Indexes:
"iprt1_a" btree (a)

Shouldn't we show all the partitions of this table and may be their ranges
of lists?

I found the partitions from EXPLAIN plan

regression=# explain verbose select * from prt1;
QUERY PLAN
-----------------------------------------------------------------------
Append (cost=0.00..6.00 rows=301 width=13)
-> Seq Scan on public.prt1 (cost=0.00..0.00 rows=1 width=40)
Output: prt1.a, prt1.b, prt1.c
-> Seq Scan on public.prt1_p1 (cost=0.00..2.25 rows=125 width=13)
Output: prt1_p1.a, prt1_p1.b, prt1_p1.c
-> Seq Scan on public.prt1_p3 (cost=0.00..1.50 rows=50 width=13)
Output: prt1_p3.a, prt1_p3.b, prt1_p3.c
-> Seq Scan on public.prt1_p2 (cost=0.00..2.25 rows=125 width=13)
Output: prt1_p2.a, prt1_p2.b, prt1_p2.c
(9 rows)

Then did \d+ on each of those to find their ranges

regression=# \d+ prt1_p1
Table "public.prt1_p1"
Column | Type | Modifiers | Storage | Stats target |
Description
--------+-------------------+-----------+----------+--------------+-------------
a | integer | | plain | |
b | integer | | plain | |
c | character varying | | extended | |
Partition Of: prt1 FOR VALUES START (0) END (250)
Indexes:
"iprt1_p1_a" btree (a)

regression=# \d+ prt1_p2
Table "public.prt1_p2"
Column | Type | Modifiers | Storage | Stats target |
Description
--------+-------------------+-----------+----------+--------------+-------------
a | integer | | plain | |
b | integer | | plain | |
c | character varying | | extended | |
Partition Of: prt1 FOR VALUES START (250) END (500)
Indexes:
"iprt1_p2_a" btree (a)

regression=# \d+ prt1_p3
Table "public.prt1_p3"
Column | Type | Modifiers | Storage | Stats target |
Description
--------+-------------------+-----------+----------+--------------+-------------
a | integer | | plain | |
b | integer | | plain | |
c | character varying | | extended | |
Partition Of: prt1 FOR VALUES START (500) END (600)
Indexes:
"iprt1_p3_a" btree (a)

As you will observe that the table prt1 can not have any row with a < 0 and
a > 600. But when I execute

regression=# explain verbose select * from prt1 where a > 1000000;
QUERY PLAN
------------------------------------------------------------------
Append (cost=0.00..0.00 rows=1 width=40)
-> Seq Scan on public.prt1 (cost=0.00..0.00 rows=1 width=40)
Output: prt1.a, prt1.b, prt1.c
Filter: (prt1.a > 1000000)
(4 rows)

it correctly excluded all the partitions, but did not exclude the parent
relation. I guess, we have enough information to exclude it. Probably, we
should add a check constraint on the parent which is OR of the check
constraints on all the partitions. So there are two problems here

1. \d+ doesn't show partitions - this is probably reported earlier, I don't
remember.
2. A combination of constraints on the partitions should be applicable to
the parent. We aren't doing that.

On Wed, Aug 10, 2016 at 4:39 PM, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp
> wrote:

> Hi,
>
> Attached is the latest set of patches to implement declarative
> partitioning. There is already a commitfest entry for the same:
> https://commitfest.postgresql.org/10/611/
>
> The old discussion is here:
> http://www.postgresql.org/message-id/55D3093C.5010800@lab.ntt.co.jp/
>
> Attached patches are described below:
>
> 0001-Catalog-and-DDL-for-partition-key.patch
> 0002-psql-and-pg_dump-support-for-partitioned-tables.patch
>
> These patches create the infrastructure and DDL for partitioned
> tables.
>
> In addition to a catalog for storing the partition key information, this
> adds a new relkind to pg_class.h. PARTITION BY clause is added to CREATE
> TABLE. Tables so created are RELKIND_PARTITIONED_REL relations which are
> to be special in a number of ways, especially with regard to their
> interactions with regular table inheritance features.
>
> PARTITION BY RANGE ({ column_name | ( expression ) } [ opclass ] [, ...])
> PARTITION BY LIST ({ column_name | ( expression ) } [ opclass ])
>
>
> 0003-Catalog-and-DDL-for-partition-bounds.patch
> 0004-psql-and-pg_dump-support-for-partitions.patch
>
> These patches create the infrastructure and DDL for partitions.
>
> Parent-child relationships of a partitioned table and its partitions are
> managed behind-the-scenes with inheritance. That means there is a
> pg_inherits entry and attributes, constraints, etc. are marked with
> inheritance related information appropriately. However this case differs
> from a regular inheritance relationship in a number of ways. While the
> regular inheritance imposes certain restrictions on what elements a
> child's schema is allowed to contain (both at creation time and
> after-the-fact), the partitioning related code imposes further
> restrictions. For example, while regular inheritance allows a child to
> contain its own columns, the partitioning code disallows that. Stuff like
> NO INHERIT marking on check constraints, ONLY are ignored by the the
> partitioning code.
>
> Partition DDL includes both a way to create new partition and "attach" an
> existing table as a partition of parent partitioned table. Attempt to
> drop a partition using DROP TABLE causes an error. Instead a partition
> needs first to be "detached" from parent partitioned table. On the other
> hand, dropping the parent drops all the partitions if CASCADE is specified.
>
> CREATE TABLE partition_name
> PARTITION OF parent_table [ (
> { column_name WITH OPTIONS [ column_constraint [ ... ] ]
> | table_constraint }
> [, ... ]
> ) ] partition_bound_spec
> [ PARTITION BY {RANGE | LIST} ( { column_name | ( expression ) } [ opclass
> ] [, ...] )
>
> CREATE FOREIGN TABLE [ IF NOT EXISTS ] table_name
> PARTITION OF parent_table [ (
> { column_name WITH OPTIONS [ column_constraint [ ... ] ]
> | table_constraint }
> [, ... ]
> ) ] partition_bound_spec
> SERVER server_name
> [ OPTIONS ( option 'value' [, ... ] ) ]
>
> ALTER TABLE parent ATTACH PARTITION partition_name partition_bound_spec [
> VALIDATE | NO VALIDATE ]
>
> ALTER TABLE parent DETACH PARTITION partition_name
>
> partition_bound_spec is:
>
> FOR VALUES { list_spec | range_spec }
>
> list_spec in FOR VALUES is:
>
> IN ( expression [, ...] )
>
> range_spec in FOR VALUES is:
>
> START lower-bound [ INCLUSIVE | EXCLUSIVE ] END upper-bound [ INCLUSIVE |
> EXCLUSIVE ]
>
> where lower-bound and upper-bound are:
>
> { ( expression [, ...] ) | UNBOUNDED }
>
> expression can be a string literal, a numeric literal or NULL.
>
> Note that the one can specify PARTITION BY when creating a partition
> itself. That is to allow creating multi-level partitioned tables.
>
>
> 0005-Teach-a-few-places-to-use-partition-check-constraint.patch
>
> A partition's bound implicitly constrains the values that are allowed in
> the partition key of its rows. The same can be applied to partitions when
> inserting data *directly* into them to make sure that only the correct
> data is allowed in (if a tuple has been routed from the parent, that
> becomes unnecessary). To that end, ExecConstraints() now includes the
> above implicit check constraint in the list of constraints it enforces.
>
> Further, to enable constraint based partition exclusion on partitioned
> tables, the planner code includes in its list of constraints the above
> implicitly defined constraints. This arrangement is temporary however and
> will be rendered unnecessary when we implement special data structures and
> algorithms within the planner in future versions of this patch to use
> partition metadata more effectively for partition exclusion.
>
> Note that the "constraints" referred to above are not some on-disk
> structures but those generated internally on-the-fly when requested by a
> caller.
>
> 0006-Introduce-a-PartitionTreeNode-data-structure.patch
> 0007-Tuple-routing-for-partitioned-tables.patch
>
> These patches enable routing of tuples inserted into a partitioned table
> to one of its leaf partitions. It applies to both COPY FROM and INSERT.
> First of these patches introduces a data structure that provides a
> convenient means for the tuple routing code to step down a partition tree
> one level at a time. The second one modifies copy.c and executor to
> implement actual tuple routing. When inserting into a partition, its row
> constraints and triggers are applied. Note that the partition's
> constraints also include the constraints defined on the parent. This
> arrangements means however that the parent's triggers are not currently
> applied.
>
> Updates are handled like they are now for inheritance sets, however, if an
> update makes a row change partition, an error will be thrown.
>
> 0008-Update-DDL-Partitioning-chapter.patch
>
> This patch updates the partitioning section in the DDL chapter to reflect
> the new methods made available for creating and managing partitioned table
> and its partitions. Especially considering that it is no longer necessary
> to define CHECK constraints and triggers/rules manually for constraint
> exclusion and tuple routing, respectively.
>
> TODO (in short term):
> * Add more regression tests and docs
> * Add PartitionOptInfo and use it to perform partition pruning more
> effectively (the added infrastructure should also help pairwise joins
> patch proposed by Ashutosh Bapat [1])
> * Fix internal representation of list partition bounds to be more efficient
>
>
> Thanks,
> Amit
>
> [1]
> https://www.postgresql.org/message-id/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXn
> a-urjm_UY9BqXj%3DEaDTSA%40mail.gmail.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2016-09-02 05:43:27 Re: [Patch] RBTree iteration interface improvement
Previous Message Amit Kapila 2016-09-02 03:55:35 Re: Hash Indexes