Re: Hash Functions

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>
Subject: Re: Hash Functions
Date: 2017-05-18 14:11:15
Message-ID: CA+TgmoZmCJxxNwevTod5i2Ka2gVNSv6NARtV4f94OLVvOn1V3A@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 18, 2017 at 1:53 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> For instance, it makes little sense to have individual check
> constraints, indexes, permissions, etc. on a hash-partitioned table.
> It doesn't mean that we should necessarily forbid them, but it should
> make us question whether combining range and hash partitions is really
> the right design.

I think that it definitely makes sense to have individual indexes on a
hash-partitioned table. If you didn't, then as things stand today,
you'd have no indexes at all, which can't be good. In the future, we
might have some system where an index created on the parent cascades
down to all of the children, but even then, you might want to REINDEX
just one of those child indexes, or better yet, create a replacement
index concurrently and then drop the old one concurrently. You might
also want to add the same sort of new index to every partition, but
not in a single operation - for reasons of load, length of maintenance
window, time for which a snapshot is held open, etc.
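
As a rough illustration of the staged approach described above, the
following Python sketch generates the statements one partition at a time,
so each batch can be run in its own maintenance window. The partition,
column, and index names are hypothetical, and the helper functions are
illustrative only; CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY
are the only real PostgreSQL commands involved.

```python
from typing import Iterator, List


def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def index_rollout(partitions: List[str], column: str) -> Iterator[List[str]]:
    """Yield one batch of statements per partition.

    Batches are independent of one another, so they can be spread over
    several maintenance windows instead of run as a single operation.
    """
    for part in partitions:
        # Hypothetical naming scheme: <partition>_<column>_idx
        idx = quote_ident(f"{part}_{column}_idx")
        yield [
            f"CREATE INDEX CONCURRENTLY {idx} "
            f"ON {quote_ident(part)} ({quote_ident(column)});"
        ]


def index_replacement(partition: str, column: str) -> List[str]:
    """Build a replacement index on one partition, then drop the old one."""
    old = f"{partition}_{column}_idx"
    new = f"{old}_new"  # hypothetical name for the replacement index
    return [
        f"CREATE INDEX CONCURRENTLY {quote_ident(new)} "
        f"ON {quote_ident(partition)} ({quote_ident(column)});",
        f"DROP INDEX CONCURRENTLY {quote_ident(old)};",
    ]


if __name__ == "__main__":
    for batch in index_rollout(["orders_p0", "orders_p1"], "customer_id"):
        print("\n".join(batch))
    print("\n".join(index_replacement("orders_p1", "customer_id")))
```

Each statement would have to be sent on its own, outside a transaction
block, since CREATE INDEX CONCURRENTLY cannot run inside one.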

I agree that separate constraints and permissions on hash partitions
don't make much sense. To a lesser extent, that's true of other kinds
of partitioning as well. I mean, there is probably some use case for
setting separate permissions on a range-partitioned table, but it's a
pretty thin use case. It certainly seems possible that many users
would prefer a rule that enforces uniform permissions across the
entire partitioning hierarchy. This is one of the key things that had
to be decided in regard to the partitioning implementation we now
have: for which things should we enforce uniformity, and for which
things should we allow diversity? I advocated for enforcing
uniformity only in areas where we could see a clear advantage to it,
which led to the fairly minimal approach of enforcing only that we had
no multiple inheritance and no extra columns in the children, but
that's certainly an arguable position. Other people argued for more
restrictions, I believe out of a desire to create more administrative
simplicity, but there is a risk of cutting yourself off from useful
configurations there, and it seems very difficult to me to draw a hard
line between what is useful and what is useless.

For example, consider a hash-partitioned table. Could it make sense
to have some but not all partitions be unlogged? I think it could.
Suppose you have a cluster of machines each of which has a replica of
the same hash-partitioned table. Each server uses logged tables for
the partitions for which it is the authoritative source of
information, and unlogged tables for the others. In the event of a
crash, the data for any tables that are lost is replicated from the
master for that machine. I can think of some disadvantages of that
design, but I can think of some advantages, too, and I think it's
pretty hard to say that nobody should ever want to do it. And if it's
legitimate to want to do that, then what if I want to use
trigger-based replication rather than logical replication? Then I
might need triggers on some partitions but not all, or maybe different
triggers on different partitions.
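
To make the mixed logged/unlogged layout concrete, here is a small
Python sketch of which persistence each server would pick for each
partition. The round-robin assignment of authoritative partitions to
servers is purely an assumption for illustration; the scenario above
does not say how ownership would be decided.

```python
from typing import Dict


def persistence_plan(n_partitions: int, n_servers: int) -> Dict[int, Dict[int, str]]:
    """Map server -> partition -> persistence.

    Assumption: partition p is owned by server p % n_servers. A server
    keeps the partitions it owns LOGGED and all others UNLOGGED.
    """
    plan: Dict[int, Dict[int, str]] = {}
    for server in range(n_servers):
        plan[server] = {
            part: "LOGGED" if part % n_servers == server else "UNLOGGED"
            for part in range(n_partitions)
        }
    return plan


if __name__ == "__main__":
    for server, parts in persistence_plan(4, 2).items():
        print(server, parts)
```

Under this assumption every partition is logged on exactly one server,
which is the server that the others would re-copy it from after a crash.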

Even for a permissions grant, suppose my production system is having
some problem that can't be replicated on the test data set. Is it
reasonable to want to give a trusted developer access to a slice of, but
not all of, my production data? I could allow them access to just one
partition. Maybe not a common desire, but is that enough reason to
ban it? I'd say it's arguable. I don't think that there are bright
lines around any of this stuff. My experience with this area has led
me to give up on the idea of complete uniformity as impractical, and
instead look at it from the perspective of "what do we absolutely have
to ban in order for this to be sane?".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
