Re: [POC] hash partitioning

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-03-01 10:07:30
Message-ID: 20170301190730.732b6fac.nagata@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 1 Mar 2017 10:30:09 +0530
Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com> wrote:

> On Tue, Feb 28, 2017 at 8:03 PM, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
> > Hi all,
> >
> > Now we have a declarative partitioning, but hash partitioning is not
> > implemented yet. Attached is a POC patch to add the hash partitioning
> > feature. I know we will need more discussions about the syntax and other
> > specifications before going ahead the project, but I think this runnable
> > code might help to discuss what and how we implement this.
> >
> > * Description
> >
> > In this patch, the hash partitioning implementation is basically based
> > on the list partitioning mechanism. However, partition bounds cannot be
> > specified explicitly, but this is used internally as hash partition
> > index, which is calculated when a partition is created or attached.
> >
> > The tentative syntax to create a partitioned table is as bellow;
> >
> > CREATE TABLE h (i int) PARTITION BY HASH(i) PARTITIONS 3 USING hashint4;
> >
> > The number of partitions is specified by PARTITIONS, which is currently
> > constant and cannot be changed, but I think this is needed to be changed in
> > some manner. A hash function is specified by USING. Maybe, specifying hash
> > function may be ommitted, and in this case, a default hash function
> > corresponding to key type will be used.
> >
> > A partition table can be create as bellow;
> >
> > CREATE TABLE h1 PARTITION OF h;
> > CREATE TABLE h2 PARTITION OF h;
> > CREATE TABLE h3 PARTITION OF h;
> >
> > FOR VALUES clause cannot be used, and the partition bound is
> > calclulated automatically as partition index of single integer value.
> >
> > When trying create partitions more than the number specified
> > by PARTITIONS, it gets an error.
> >
> > postgres=# create table h4 partition of h;
> > ERROR: cannot create hash partition more than 3 for h
> >
> > An inserted record is stored in a partition whose index equals
> > abs(hashfunc(key)) % <number_of_partitions>. In the above
> > example, this is abs(hashint4(i))%3.
> >
> > postgres=# insert into h (select generate_series(0,20));
> > INSERT 0 21
> >
> > postgres=# select *,tableoid::regclass from h;
> > i | tableoid
> > ----+----------
> > 0 | h1
> > 1 | h1
> > 2 | h1
> > 4 | h1
> > 8 | h1
> > 10 | h1
> > 11 | h1
> > 14 | h1
> > 15 | h1
> > 17 | h1
> > 20 | h1
> > 5 | h2
> > 12 | h2
> > 13 | h2
> > 16 | h2
> > 19 | h2
> > 3 | h3
> > 6 | h3
> > 7 | h3
> > 9 | h3
> > 18 | h3
> > (21 rows)
> >
> >
> This is good, I will have closer look into the patch, but here are
> few quick comments.

Thanks. I'm looking forward to your comments.

>
> - CREATE HASH partition syntax adds two new keywords and ideally
> we should try to avoid adding additional keywords. Also I can see that
> HASH keyword been added, but I don't see any use of newly added
> keyword in gram.y.

Yes, you are right. HASH keyword is not necessary. I'll remove it
from the patch.

>
> - Also I didn't like the idea of fixing number of partitions during the
> CREATE
> TABLE syntax. Thats something that needs to be able to changes.

I agree. The number specified by PARTIONS should be the *initial* number
of partitions and this should be abelt to be changed. I'm investigating
the way.

>
>
>
> > * Todo / discussions
> >
> > In this patch, we cannot change the number of partitions specified
> > by PARTITIONS. I we can change this, the partitioning rule
> > (<partition index> = abs(hashfunc(key)) % <number_of_partitions>)
> > is also changed and then we need reallocatiing records between
> > partitions.
> >
> > In this patch, user can specify a hash function USING. However,
> > we migth need default hash functions which are useful and
> > proper for hash partitioning.
> >
>
> +1
>
> - With fixing default hash function and not specifying number of partitions
> during CREATE TABLE - don't need two new additional columns into
> pg_partitioned_table catalog.

I think the option to specify a hash function is needed because
user may want to use a user-defined hash function for some reasons,
for example, when a user-defined type is used as a partition key.

>
>
> > Currently, even when we issue SELECT query with a condition,
> > postgres looks into all partitions regardless of each partition's
> > constraint, because this is complicated such like "abs(hashint4(i))%3 = 0".
> >
> > postgres=# explain select * from h where i = 10;
> > QUERY PLAN
> > ----------------------------------------------------------
> > Append (cost=0.00..125.62 rows=40 width=4)
> > -> Seq Scan on h (cost=0.00..0.00 rows=1 width=4)
> > Filter: (i = 10)
> > -> Seq Scan on h1 (cost=0.00..41.88 rows=13 width=4)
> > Filter: (i = 10)
> > -> Seq Scan on h2 (cost=0.00..41.88 rows=13 width=4)
> > Filter: (i = 10)
> > -> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4)
> > Filter: (i = 10)
> > (9 rows)
> >
> > However, if we modify a condition into a same expression
> > as the partitions constraint, postgres can exclude unrelated
> > table from search targets. So, we might avoid the problem
> > by converting the qual properly before calling predicate_refuted_by().
> >
> > postgres=# explain select * from h where abs(hashint4(i))%3 =
> > abs(hashint4(10))%3;
> > QUERY PLAN
> > ----------------------------------------------------------
> > Append (cost=0.00..61.00 rows=14 width=4)
> > -> Seq Scan on h (cost=0.00..0.00 rows=1 width=4)
> > Filter: ((abs(hashint4(i)) % 3) = 2)
> > -> Seq Scan on h3 (cost=0.00..61.00 rows=13 width=4)
> > Filter: ((abs(hashint4(i)) % 3) = 2)
> > (5 rows)
> >
> > Best regards,
> > Yugo Nagata
> >
> > --
> > Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers
> >
> >
>
>
> Regards,
>
> Rushabh Lathia

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2017-03-01 10:20:25 Re: [POC] hash partitioning
Previous Message Peter Moser 2017-03-01 09:56:50 Re: [PROPOSAL] Temporal query processing with range types