Re: [POC] hash partitioning

From: amul sul <sulamul(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [POC] hash partitioning
Date: 2017-03-03 13:33:22
Message-ID: CAAJ_b97QsZQZp9v5d9wGPjT4wYXc7X1fFDUu-xveN-HQax8tRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 3, 2017 at 5:00 PM, Greg Stark <stark(at)mit(dot)edu> wrote:

> On 2 March 2017 at 13:03, amul sul <sulamul(at)gmail(dot)com> wrote:
> > create table foo (a integer, b text) partition by hash (a);
> > create table foo1 partition of foo with (modulus 4, remainder 0);
> > create table foo2 partition of foo with (modulus 8, remainder 1); --
> legal,
> > modulus doesn't need to match
> > create table foo3 partition of foo with (modulus 8, remainder 4); --
> > illegal, overlaps foo1
>
>
> Instead of using modulus, why not just divide up the range of hash
> keys using ranges?
> ​ ​
> That should be just as good for a good hash

function (effectively using the high bits instead of the low bits of
> the hash value). And it would mean you could reuse the machinery for
> list partitioning for partition exclusion.
>
It also has the advantage that it's easier to see how to add more
> partitions. You just split all the ranges and (and migrate the
> data...). There's even the possibility of having uneven partitions if
> you have a data distribution skew -- which can happen even if you have
> a good hash function. In a degenerate case you could have a partition
> for a single hash of a particularly common value then a reasonable
> number of partitions for the remaining hash ranges.
>

Initially
​we
had
​to have ​
somewhat similar thought to make a range of hash
values for
​ ​
each partition, using the same half-open interval syntax we use in general:

create table foo (a integer, b text) partition by hash (a);
create table foo1 partition of foo for values from (0) to (1073741824);
create table foo2 partition of foo for values from (1073741824) to
(-2147483648);
create table foo3 partition of foo for values from (-2147483648) to
(-1073741824);
create table foo4 partition of foo for values from (-1073741824) to (0);

That's really nice for the system, but not so much for the users. The
system can
now generate each partition constraint correctly immediately upon seeing
the SQL
statement for the corresponding table, which is very desirable. However,
users are
not likely to know that the magic numbers to distribute keys equally across
four
partitions are 1073741824, -2147483648, and -1073741824.

So it's pretty
​ ​
user-unfriendly.
​​

​Regards,
Amul​

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2017-03-03 13:45:00 Re: GUC for cleanup indexes threshold.
Previous Message Ashutosh Bapat 2017-03-03 13:31:00 Re: Questions about MergeAppend