Re: [POC] hash partitioning

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, amul sul <sulamul(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-04-17 08:50:42
Message-ID: 20170417175042.161c71b4.nagata@sraoss.co.jp
Lists: pgsql-hackers

On Fri, 14 Apr 2017 09:05:14 -0400
Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Apr 14, 2017 at 4:23 AM, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > On Thu, 13 Apr 2017 16:40:29 -0400
> > Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> On Fri, Mar 17, 2017 at 7:57 AM, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> >> > I also understand that my design has a problem with pg_dump and
> >> > pg_upgrade, and that some information identifying each partition
> >> > is required that does not depend on the command order. However, I
> >> > feel that Amul's design is a bit complicated with its rule for
> >> > specifying the modulus.
> >> >
> >> > I think we can use simpler syntax, for example, as below.
> >> >
> >> > CREATE TABLE h1 PARTITION OF h FOR (0);
> >> > CREATE TABLE h2 PARTITION OF h FOR (1);
> >> > CREATE TABLE h3 PARTITION OF h FOR (2);
> >>
> >> I don't see how that can possibly work. Until you see all the table
> >> partitions, you don't know what the partitioning constraint for any
> >> given partition should be, which seems to me to be a fatal problem.
> >
> > If a partition has an id, the partitioning constraint can be written as
> >
> > hash_func(hash_key) % N = id
> >
> > where N is the number of partitions. Wouldn't that work?
>
> Only if you know the number of partitions. But with your syntax,
> after seeing only the first of the CREATE TABLE .. PARTITION OF
> commands, what should the partition constraint be? It depends on how
> many more such commands appear later in the dump file, which you do
> not know at that point.

I thought that the partition constraint could be decided every
time a new partition is created or attached, and that records would
need to be relocated automatically whenever the partition configuration
changes. However, I have come to think that automatic relocation
might not be needed at this point.
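
To illustrate why the constraint for one partition cannot be fixed from a
single CREATE TABLE command, here is a minimal sketch. It assumes small
integers stand in for hash values and that the constraint is
hash % N = id as written above; the function names are hypothetical and
are not PostgreSQL code.

```python
def partition_id(hash_value: int, num_partitions: int) -> int:
    """Partition chosen by the constraint hash % N = id."""
    return hash_value % num_partitions


def rows_in_partition(hash_values, part_id, num_partitions):
    """Hash values that satisfy the constraint of partition `part_id`."""
    return [h for h in hash_values
            if partition_id(h, num_partitions) == part_id]


# Assumption: integers 0..11 stand in for real hash values.
hashes = list(range(12))

# Contents of the partition with id 0 when N = 2 versus N = 3.
with_two = rows_in_partition(hashes, 0, 2)
with_three = rows_in_partition(hashes, 0, 3)
```

The same partition id accepts a different set of rows for each N, so its
constraint is only known once the total number of partitions is known.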

>
> >> I agree that Amul's syntax - really, I proposed it to him - is not the
> >> simplest, but I think all the details needed to reconstruct the
> >> partitioning constraint need to be explicit. Otherwise, I'm pretty
> >> sure we're going to have lots of problems that we can't really
> >> solve cleanly. We can later invent convenience syntax that makes
> >> common configurations easier to set up, but we should invent the
> >> syntax that spells out all the details first.
> >
> > I have a question about Amul's syntax. After we create partitions
> > as follows,
> >
> > create table foo (a integer, b text) partition by hash (a);
> > create table foo1 partition of foo with (modulus 2, remainder 0);
> > create table foo2 partition of foo with (modulus 2, remainder 1);
> >
> > we cannot create any additional partitions for the parent table.
> >
> > Then, after inserting records into foo1 and foo2, how can we
> > increase the number of partitions?
>
> You can detach foo1, create two new partitions with modulus 4 and
> remainders 0 and 2, and move the data over from the old partition.
>
> I realize that's not as automated as you might like, but it's no worse
> than what is currently required for list and range partitioning when
> you split a partition. Someday we might build in tools to do that
> kind of data migration automatically, but right now we have none.

Thanks, I understand. It would be better to implement the automatic
data migration feature separately.
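
The split described above (one partition with modulus 2 and remainder 0
replaced by two partitions with modulus 4 and remainders 0 and 2) can be
checked with a short sketch. It assumes small integers stand in for hash
values; the helper names are hypothetical and this is not how PostgreSQL
moves the data.

```python
def matches(hash_value: int, modulus: int, remainder: int) -> bool:
    """True if the value satisfies hash % modulus = remainder."""
    return hash_value % modulus == remainder


def split_partition(hash_values, old_modulus, old_remainder):
    """Redistribute one partition's rows over two partitions with a
    doubled modulus; the new remainders are r and r + old_modulus."""
    new_modulus = old_modulus * 2
    new_remainders = (old_remainder, old_remainder + old_modulus)
    return {
        (new_modulus, r): [h for h in hash_values
                           if matches(h, new_modulus, r)]
        for r in new_remainders
    }


# Assumption: integers 0..15 stand in for real hash values.
foo1_rows = [h for h in range(16) if matches(h, 2, 0)]
foo2_rows = [h for h in range(16) if matches(h, 2, 1)]

split = split_partition(foo1_rows, 2, 0)
```

Every row of the old partition lands in exactly one of the two new
partitions, and rows of the other partition (modulus 2, remainder 1) are
not affected.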

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
