Re: Hash Functions

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>
Subject: Re: Hash Functions
Date: 2017-05-17 06:18:06
Message-ID: CAFjFpRcXciUafxquuvjSF+cLPjxB74DyM_2meg7P8TZdXR2DUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 16, 2017 at 8:40 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Mon, May 15, 2017 at 1:04 PM, David Fetter <david(at)fetter(dot)org> wrote:
>> As the discussion has devolved here, it appears that there are, at
>> least conceptually, two fundamentally different classes of partition:
>> public, which is to say meaningful to DB clients, and "private", used
>> for optimizations, but otherwise opaque to DB clients.
>>
>> Mashing those two cases together appears to cause more problems than
>> it solves.
>
> I concur at this point. I originally thought hash functions might be
> made portable, but I think Tom and Andres showed that to be too
> problematic -- the issue with different encodings is the real killer.
>
> But I also believe hash partitioning is important and we shouldn't
> give up on it yet.
>
> That means we need to have a concept of hash partitions that's
> different from range/list partitioning. The terminology
> "public"/"private" does not seem appropriate. Logical/physical or
> external/internal might be better.
>
> With hash partitioning:
> * User only specifies number of partitions of the parent table; does
> not specify individual partition properties (modulus, etc.)

a well distributed integer column doesn't even need to be hashed, a
simple modulo works with it. If we are going towards "implicit" (yet
another name) partitioning, we could choose the strategy based on the
data type of the partition key, not just hash it always. Although, we
might end up hashing it in most of the cases.

> * Dump/reload goes through the parent table (though we may provide
> options so pg_dump/restore can optimize this)

Probably you imply immediate hash partitioned parent, but just let me
clarify it a bit. We support multi-level partitioning with each
partitioned table anywhere in the partitioning hierarchy choosing any
partitioning scheme. So, we can have range partitioned table as a
partition of a hash partitioned table or for that matter, a non-hash
partitioned table which is somewhere in the hiearchy rooted at the
hash partitioned table. So, for range/list even hash partitions that
are grand-children of a hash partitioned table, we will need to route
dump/reload through that hash partitioned table i.e. route it through
the topmost hash-partitioned table.

> * We could provide syntax to adjust the number of partitions, which
> would be expensive but still useful sometimes.
> * All DDL should be on the parent table, including check constraints,
> FKs, unique constraints, exclusion constraints, indexes, etc.

i.e. topmost hash partitioned table as explained above.

> - Unique and exclusion constraints would only be permitted if the
> keys are a superset of the partition keys.

Do you think this constraint apply even after we support global
indexes? Isn't this applicable to all the partitioning strategies?

> - FKs would only be permitted if the two table's partition schemes
> match and the keys are members of the same hash opfamily (this could
> be relaxed slightly, but it gets a little confusing if so)
> * No attach/detach of partitions

It will be good, if we can support this for maintenance purpose. If a
partition goes bad, we could replace it with its copy somewhere, using
attach and detach without affecting the whole table. Now does that
mean, that we will need to support some form of pg_dump/copy with
special flag to create copies of individual partitions? I think that
proposal has already been floated.

> * All partitions have the same permissions

Why's that?

>
> The only real downside is that it could surprise users -- why can I
> add a CHECK constraint on my range-partitioned table but not the
> hash-partitioned one? We should try to document this so users don't
> find that out too far along. As long as they aren't surprised, I think
> users will understand why these aren't quite the same concepts.
>

For a transparent hash (non-transparent in the sense of what Mark
Dilger proposed), any constraint other than implicit partitioning
constraint is applicable to the whole table or it's not applicable at
all. So, better if user adds it on the parent hash table. So, yes,
with this reasoning, we could document this fact.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-05-17 06:21:31 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Amit Langote 2017-05-17 06:04:42 Re: transition table behavior with inheritance appears broken (was: Declarative partitioning - another take)