Re: Hash Functions

From: David Fetter <david(at)fetter(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>
Subject: Re: Hash Functions
Date: 2017-05-16 16:59:43
Message-ID: 20170516165943.GB11076@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 16, 2017 at 08:10:39AM -0700, Jeff Davis wrote:
> On Mon, May 15, 2017 at 1:04 PM, David Fetter <david(at)fetter(dot)org> wrote:
> > As the discussion has devolved here, it appears that there are, at
> > least conceptually, two fundamentally different classes of partition:
> > public, which is to say meaningful to DB clients, and "private", used
> > for optimizations, but otherwise opaque to DB clients.
> >
> > Mashing those two cases together appears to cause more problems than
> > it solves.
>
> I concur at this point. I originally thought hash functions might be
> made portable, but I think Tom and Andres showed that to be too
> problematic -- the issue with different encodings is the real killer.
>
> But I also believe hash partitioning is important and we shouldn't
> give up on it yet.
>
> That means we need to have a concept of hash partitions that's
> different from range/list partitioning. The terminology
> "public"/"private" does not seem appropriate. Logical/physical or
> external/internal might be better.

I'm not attached to any particular terminology.

> With hash partitioning:
> * User only specifies number of partitions of the parent table; does
> not specify individual partition properties (modulus, etc.)

Maybe this is over-thinking it, but I'm picturing us ending up with
something along the lines of:

PARTITION BY INTERNAL(EXPRESSION)
[ WITH ( [PARAMETERS] [, AS, APPROPRIATE] ) ]

i.e. it's not clear that we should wire in "number of partitions" as a
parameter.

In a not that distant future, ANALYZE and similar could have a say in
determining both the "how" and the "whether" of partitioning.

> * Dump/reload goes through the parent table (though we may provide
> options so pg_dump/restore can optimize this)

Would it be simplest to default to routing through the immediate
ancestor for now?

It occurs to me that with the opaque partition system we're designing
here, internal partitions would necessarily be leaves in the tree.

> * We could provide syntax to adjust the number of partitions, which
> would be expensive but still useful sometimes.

Yep. I suspect that techniques for this are described in literature,
and possibly even in code bases. Any pointers?

> * All DDL should be on the parent table, including check constraints,
> FKs, unique constraints, exclusion constraints, indexes, etc.

Necessarily.

> - Unique and exclusion constraints would only be permitted if the
> keys are a superset of the partition keys.

"Includes either all of the partition expression or none of it,"
maybe?

> - FKs would only be permitted if the two table's partition schemes
> match and the keys are members of the same hash opfamily (this could
> be relaxed slightly, but it gets a little confusing if so)

Relaxing sounds like a not-in-the-first-cut feature, and subtle.

> * No attach/detach of partitions

Since they're opaque, this is the only sane thing.

> * All partitions have the same permissions

Since they're opaque, this is the only sane thing.

> * Individual partitions would only be individually-addressable for
> maintenance (like reindex and vacuum), but not for arbitrary queries

Since they're opaque, this is the only sane thing.

> - perhaps also COPY for bulk loading/dumping, in case we get clients
> smart enough to do their own hashing.

This is appealing from a resource allocation point of view in the
sense of deciding where the hash computing resources are spent. Do we
want something like the NOT VALID/VALIDATE infrastructure to support
it?

> The only real downside is that it could surprise users -- why can I
> add a CHECK constraint on my range-partitioned table but not the
> hash-partitioned one? We should try to document this so users don't
> find that out too far along. As long as they aren't surprised, I think
> users will understand why these aren't quite the same concepts.

+1

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-05-16 17:23:13 Re: [POC] hash partitioning
Previous Message Tom Lane 2017-05-16 16:45:39 Re: statement_timeout is not working as expected with postgres_fdw