From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Joe Conway <mail(at)joeconway(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, amul sul <sulamul(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: Hash Functions |
Date: | 2017-06-02 05:24:54 |
Message-ID: | CAMp0ubfHeUwsjck_R34v7PQqK7M5_WFtaCbT35Q2GcgxfDsQdA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Jun 1, 2017 at 10:59 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> 1. Are the new problems worse than the old ones?
>
> 2. What could we do about it?
Exactly the right questions.
1. For range partitioning, I think it's "yes, a little". As you point
out, there are already some weird edge cases -- the main way range
partitioning would make the problem worse is simply by having more
users.
But for hash partitioning I think the problems will become more
substantial. Different encodings, endian issues, etc. will be a
headache for someone, and potentially a bad day if they are urgently
trying to restore on a new machine. We should remember that not
everyone is a full-time postgres DBA, and users might reasonably think
that the default options to pg_dump[all] will give them a portable
dump.
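To make the endian concern concrete, here is a rough sketch. It is not PostgreSQL's hash code: FNV-1a stands in for the real hash function, and `partition_for` and `misrouted_keys` are names made up for illustration. The assumption is simply that the hash is computed over the key's in-memory bytes, so the same integer can land in a different partition on a machine with the other byte order.

```python
import struct

FNV_OFFSET = 2166136261
FNV_PRIME = 16777619


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here only as a stand-in for a real hash function."""
    h = FNV_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def partition_for(key: int, byte_order: str, num_partitions: int) -> int:
    """Hash the key's 4-byte representation in the given byte order
    ("little" or "big") and map it to a partition number."""
    fmt = "<i" if byte_order == "little" else ">i"
    return fnv1a_32(struct.pack(fmt, key)) % num_partitions


def misrouted_keys(keys, num_partitions):
    """Keys whose partition differs between little- and big-endian hashing."""
    return [
        k for k in keys
        if partition_for(k, "little", num_partitions)
        != partition_for(k, "big", num_partitions)
    ]


if __name__ == "__main__":
    # Rows dumped per-partition on one machine would violate the partition
    # constraint on the other machine for every key listed here.
    print(misrouted_keys(range(20), 4))
```

A dump that loads each partition directly would fail (or need rerouting) for every key that `misrouted_keys` reports.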
2. I basically see two approaches to solve the problem:
(a) Tom suggested at PGCon that we could have a GUC that
automatically causes inserts to the partition to be re-routed through
the parent. We could discuss whether to always route through the
parent, or to recheck the partition constraints and reroute only the
tuples that would fail them. If the user gets into trouble, the worst
that would happen is a helpful error message telling them to set the
GUC. I like this idea.
(b) I had suggested before that we could make the default text dump
(and the default output from pg_restore, for consistency) route
through the parent. Advanced users would dump with -Fc, and pg_restore
would support an option to do partition-wise loading. To me, this is
simpler, but users might forget to use (or not know about) the
pg_restore option and then it would load more slowly. Also, the ship
is sailing on range partitioning, so we might prefer option (a) just
to avoid making any changes.
I am fine with either option.
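As a rough illustration of the "recheck and reroute" variant of (a), here is a toy model. Nothing in it is PostgreSQL code: the class `HashPartitionedTable`, its methods, and the `reroute` flag (standing in for the proposed GUC, which has no name yet) are all hypothetical.

```python
class HashPartitionedTable:
    """Toy model of a hash-partitioned parent table."""

    def __init__(self, num_partitions, hash_fn):
        self.hash_fn = hash_fn
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_index(self, key):
        """Partition whose constraint the key satisfies."""
        return self.hash_fn(key) % len(self.partitions)

    def insert_via_parent(self, key, row):
        """Normal tuple routing: the parent picks the partition."""
        idx = self.partition_index(key)
        self.partitions[idx].append((key, row))
        return idx

    def insert_into_partition(self, target, key, row, reroute=False):
        """Direct insert into one partition, as a per-partition dump would do.

        The partition constraint is rechecked first. A tuple that fails it is
        rerouted through the parent when `reroute` is set; otherwise the
        insert fails with a message pointing at the setting.
        """
        if self.partition_index(key) == target:
            self.partitions[target].append((key, row))
            return target
        if not reroute:
            raise ValueError(
                f"key {key!r} violates the constraint of partition {target}; "
                "enable rerouting to insert through the parent"
            )
        return self.insert_via_parent(key, row)


if __name__ == "__main__":
    table = HashPartitionedTable(4, hash_fn=lambda k: k)
    print(table.insert_into_partition(1, 5, "fits"))                 # stays put
    print(table.insert_into_partition(0, 6, "moved", reroute=True))  # rerouted
```

Only the tuples that fail the recheck pay the cost of routing, which is the appeal of this variant over always going through the parent.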
> 2. Add an option like --dump-partition-data-with-parent. I'm not sure
> who originally proposed this, but it seems that everybody likes it.
> What we disagree about is the degree to which it's sufficient. Jeff
> Davis thinks it doesn't go far enough: what if you have an old
> plain-format dump that you don't want to hand-edit, and the source
> database is no longer available? Most people involved in the
> unconference discussion of partitioning at PGCon seemed to feel that
> wasn't really something we should worry about too much. I had been
> taking that position also, more or less because I don't see that there
> are better alternatives.
If the suggestions above are unacceptable, and we don't come up with
anything better, then of course we have to move on. I am worrying now
primarily because now is the best time to worry; I don't expect any
horrible outcome.
> 3. Implement portable hash functions (Jeff Davis or me, not sure
> which). Andres scoffed at this idea, but I still think it might have
> legs.
I think it reduces the problem, which has value, but it's hard to make
it rock-solid.
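A sketch of what I mean, under loud assumptions: FNV-1a again stands in for a real hash function, and `portable_hash_int`, `portable_hash_text` and `native_hash_text` are hypothetical names, not anything in the tree. Integers are the easy case, because we can always hash a fixed byte order. Text is where it gets hard, because the bytes depend on the encoding unless everything is converted to one canonical encoding first (UTF-8 is assumed here), and that conversion is extra work on every hash.

```python
import struct

FNV_OFFSET = 2166136261
FNV_PRIME = 16777619


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here only as a stand-in for a real hash function."""
    h = FNV_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def portable_hash_int(key: int) -> int:
    """Hash an 8-byte big-endian encoding, whatever the host byte order is."""
    return fnv1a_32(struct.pack(">q", key))


def native_hash_text(value: str, server_encoding: str) -> int:
    """Hash the string's bytes as stored in the given encoding."""
    return fnv1a_32(value.encode(server_encoding))


def portable_hash_text(value: str) -> int:
    """Hash a canonical encoding (UTF-8 assumed) instead of the stored bytes."""
    return fnv1a_32(value.encode("utf-8"))


if __name__ == "__main__":
    word = "caf\u00e9"
    # The same string hashes differently depending on the encoding it is
    # stored in, unless it is converted to the canonical encoding first.
    print(native_hash_text(word, "latin-1") == native_hash_text(word, "utf-8"))
    print(portable_hash_text(word) == native_hash_text(word, "utf-8"))
```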
> make fast. Those two things also solve different parts of the
> problem; one is insulating the user from a difference in hardware
> architecture, while the other is insulating the user from a difference
> in user-selected settings. I think that the first of those things is
> more important than the second, because it's easier to change your
> settings than it is to change your hardware.
Good point.
Regards,
Jeff Davis