Re: [PATCH] Automatic HASH and LIST partition creation

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amul Sul <sulamul(at)gmail(dot)com>
Subject: Re: [PATCH] Automatic HASH and LIST partition creation
Date: 2020-09-08 14:03:39
Message-ID: CALT9ZEECZacCRuSthkq5ivOysL5E--v6vr9=0CFouSjEKpbOow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> The patch lacks documentation, because I expect some details may change
> during discussion. Other than that, the feature is ready for review.
>
Hi, hackers!

From what I've read I see there is much interest in automatic partitions
creation. (Overall discussion on the topic is partitioned into two threads:
(1)
https://www.postgresql.org/message-id/alpine.DEB.2.21.1907150711080.22273%40lancre
and
(2)
https://www.postgresql.org/message-id/flat/7fec3abb-c663-c0d2-8452-a46141be6d4a(at)postgrespro(dot)ru
(current thread) )

There were many syntax proposals and finally, there is a patch realizing
one of them. So I'd like to review it.

The syntax proposed in the patch seems good enough for me and is in
accordance with one of the proposals in the discussion. Maybe I'd prefer
using the word AUTOMATICALLY/AUTO instead of CONFIGURATION with explicit
meaning that using this syntax we'd get already (automatically) created
partitions and don't need to create them manually, as in the existing state
of postgresql declarative partitioning.

CREATE TABLE tbl (i int) PARTITION BY HASH (i) AUTOMATICALLY (MODULUS
3); (partitions are created automatically)

vs

CREATE TABLE tbl (i int) PARTITION BY HASH (i); (partitions should be
created manually by use of PARTITION OF)

CREATE TABLE tbl (i char) PARTITION BY LIST (i) AUTOMATICALLY (VALUES
IN ('a', 'b'), ('c', 'd'), ('e','f') DEFAULT PARTITION tbl_default);

vs

CREATE TABLE tbl (i char) PARTITION BY LIST (i); (partitions should be
created manually by use of PARTITION OF)

I think this syntax can also be extended later with adding automatic
creation of RANGE partitions, with IMMEDIATE/DEFERRED for dynamic/on-demand
automatic partition creation, and with SUBPARTITION possibility.

But I don't have a strong preference for the word AUTOMATICALLY, moreover I
saw opposition to using AUTO at the top of the discussion. I suppose we can
go with the existing CONFIGURATION word.

If compare with existing declarative partitions, I think automatic creation
simplifies the process for the end-user and I'd vote for its committing
into Postgres. The patch is short and clean in code style. It has enough
comments Tests covering the new functionality are included. Yet it doesn't
have documentation and I'd suppose it's worth adding it. Even if there will
be syntax changes, I hope they will not be more than the replacement of
several words. Current syntax is described in the text of a patch.

The patch applies cleanly and installcheck-world is passed.

Some minor things:

I've got a compiler warning:
parse_utilcmd.c:4280:15: warning: unused variable 'lc' [-Wunused-variable]

When the number of partitions is over the maximum value of int32 the output
shows a generic syntax error. I don't think it is very important as it is
not the case someone will make deliberately, but maybe it's better to
output something like "Partitions number is more than the maximum supported
value"
create table test (i int, t text) partition by hash (i) configuration
(modulus 888888888888);
ERROR: syntax error at or near "888888888888"

One more piece of nitpicking. Probably we can go just with a mention in
documentation.
create table test (i int, t text) partition by hash (i) configuration
(modulus 8888);
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

Typo:
+ /* Add statemets to create each partition after we create parent table */

Overall I see the patch almost ready for commit and I'd like to meet this
functionality in v14.

Tested it and see this feature very cool and much simpler to use compared
to declarative partitioning to date.

Thanks!
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-09-08 14:06:58 Re: Disk-based hash aggregate's cost model
Previous Message Amit Langote 2020-09-08 14:00:12 Re: [POC] Fast COPY FROM command for the table with foreign partitions