Hash Functions

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>
Subject: Hash Functions
Date: 2017-05-19 06:36:39
Message-ID: CAMp0ubfHWgYFChS50QmFyz6cKoey9fGt1CgFn8TK8p3XnY+G8Q@mail.gmail.com
Lists: pgsql-hackers

On Thursday, May 18, 2017, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> My experience with this area has led
> me to give up on the idea of complete uniformity as impractical, and
> instead look at it from the perspective of "what do we absolutely have
> to ban in order for this to be sane?".

I could agree to something like that. Let's explore some of the challenges
there and potential solutions:

1. Dump/reload of hash partitioned data.

Falling back to restore-through-the-root seems like a reasonable answer
here. Moving to a different encoding is not an edge case, but it's not
common either, so a performance penalty seems acceptable. I'm not
immediately sure how we'd implement this in pg_dump/restore, so I'd feel a
little more comfortable if I saw a sketch.
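
To make the encoding concern concrete, here is a minimal Python sketch. It assumes the partition is chosen by hashing the stored bytes of a text key, and it uses a truncated SHA-256 as a stand-in hash (not PostgreSQL's actual hash function). The modulus, key values, and function names are all hypothetical.

```python
import hashlib

MODULUS = 16  # illustrative number of hash partitions


def toy_hash(data: bytes) -> int:
    """Stand-in hash over raw bytes; NOT PostgreSQL's hash function."""
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def partition_for(text: str, encoding: str) -> int:
    """Remainder a text key maps to when its bytes use the given encoding."""
    return toy_hash(text.encode(encoding)) % MODULUS


def rows_that_move(keys, old_encoding, new_encoding):
    """Keys whose partition differs between the two encodings."""
    return [
        k for k in keys
        if partition_for(k, old_encoding) != partition_for(k, new_encoding)
    ]


# Non-ASCII keys have different byte representations in LATIN1 and UTF-8.
keys = [f"caf\u00e9-{i}" for i in range(200)]
moved = rows_that_move(keys, "latin-1", "utf-8")
print(f"{len(moved)} of {len(keys)} keys change partition after re-encoding")

# Pure-ASCII keys encode to identical bytes, so none of them move.
ascii_keys = [f"cafe-{i}" for i in range(200)]
print(len(rows_that_move(ascii_keys, "latin-1", "utf-8")))
```

Under these assumptions, a row dumped from one partition cannot be loaded straight back into the partition with the same remainder. Loading through the root recomputes the remainder for every row, which is where the performance penalty comes from.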

2. Having a lot of hash partitions would be cumbersome

The user would need to create and manage each partition, and try to do
global operations in a sane way. The normal case would probably involve
scripts to do things like add an index or a column to all partitions. Many
partitions would also just pollute the namespace unless you remember to put
them in a separate schema (yes, it's easy, but most people will still
forget). Some syntactic sugar would go a long way here.
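
The kind of script meant here might look like the following Python sketch, which only generates SQL text. The table name, column name, `parts` schema, and `_pN` naming scheme are hypothetical; nothing in this message fixes a naming convention.

```python
def index_ddl(parent, column, count, schema="parts"):
    """One CREATE INDEX statement per partition (names are hypothetical)."""
    return [
        f"CREATE INDEX {parent}_p{i}_{column}_idx "
        f"ON {schema}.{parent}_p{i} ({column});"
        for i in range(count)
    ]


# Emit the statements for a 16-partition table kept in its own schema.
for stmt in index_ddl("orders", "customer_id", 16):
    print(stmt)
```

Every global operation needs a loop like this, and every user would end up writing their own copy of it.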

3. The user would need to specify details they really don't care about for
each partition.

Things like "modulus 16, remainder 0", "modulus 16, remainder 1" are
tedious boilerplate. And if the user makes a mistake, such as skipping a
remainder, then roughly 1/16 of inserts start failing. That would probably
be caught during testing, but it's not exactly a
good user experience. I'm not thrilled about this, considering that all the
user really wants is 16 partitions, but it's not the end of the world.
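
As a rough illustration of the boilerplate, the Python sketch below generates one statement per remainder and checks a set of declared remainders for gaps. The clause spelling `FOR VALUES WITH (MODULUS m, REMAINDER r)` is an assumption (this message only quotes "modulus 16, remainder 0"), and the table and schema names are hypothetical.

```python
def partition_ddl(parent, modulus, schema="parts"):
    """One CREATE TABLE ... PARTITION OF statement per remainder.

    The FOR VALUES WITH (...) spelling is an assumed syntax.
    """
    return [
        f"CREATE TABLE {schema}.{parent}_p{r} PARTITION OF {parent} "
        f"FOR VALUES WITH (MODULUS {modulus}, REMAINDER {r});"
        for r in range(modulus)
    ]


def missing_remainders(modulus, declared):
    """Remainders with no partition; rows hashing to these have no home."""
    return sorted(set(range(modulus)) - set(declared))


for stmt in partition_ddl("orders", 16):
    print(stmt)

# A hand-written list that accidentally skips remainder 8.
declared = [r for r in range(16) if r != 8]
print(missing_remainders(16, declared))  # prints [8]
```

All the user actually supplied here is the number 16; everything else is derived from it, which is the argument for some shorthand.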

4. Detach is a foot-gun

If you detach a partition, random inserts will start failing. Not thrilled
about this, but a hapless user would accept most of the blame if they
stumble over it. Put another way, with hash partitioning you really need
the whole set of partitions for the table to be online at all. But we can't
really enforce that, because it would limit some of the flexibility that
you have in mind.
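
A toy simulation of the failure mode, in Python. It assumes rows are routed by `hash(key) % modulus` and uses a truncated SHA-256 as a stand-in hash (not PostgreSQL's hash function). The key format and the choice of which remainder to detach are arbitrary.

```python
import hashlib


def toy_hash(key: str) -> int:
    """Stand-in hash; NOT PostgreSQL's hash function."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:4], "big")


def route(key, modulus, attached):
    """Return the remainder for key, or raise if that partition is detached."""
    remainder = toy_hash(key) % modulus
    if remainder not in attached:
        raise LookupError(f"no partition for remainder {remainder}")
    return remainder


def count_failures(keys, modulus, attached):
    """Number of keys whose insert would fail for lack of a partition."""
    failed = 0
    for key in keys:
        try:
            route(key, modulus, attached)
        except LookupError:
            failed += 1
    return failed


keys = [f"user-{i}" for i in range(1000)]
attached = set(range(16))
print(count_failures(keys, 16, attached))  # all attached: prints 0

attached.discard(5)  # "detach" one partition
print(count_failures(keys, 16, attached), "of", len(keys), "inserts fail")
```

From the application's point of view the failing keys look random, since nothing about a key reveals its remainder.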

Stepping back, your approach might be closer to the general Postgres
philosophy of allowing the user to assemble from spare parts first, then a
few releases later we offer some pre-built subassemblies, and a few
releases later we make the typical cases work out of the box. I'm fine with
it as long as we don't paint ourselves into a corner.

Of course we still have work to do on the hash functions. We should solve
at least the most glaring portability problems, and try to harmonize the
hash opfamilies. If you agree, I can put together a patch or two.

Regards,
Jeff Davis
