Re: pg_dump versus hash partitioning

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew <pgsqlhackers(at)andrewrepp(dot)com>
Subject: Re: pg_dump versus hash partitioning
Date: 2023-02-01 23:15:43
Message-ID: CAH2-WzmZ7hMNCRx_=+GRYx+sDiXQxsSX136dGMbF4jOzE7yMRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 1, 2023 at 2:49 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> It's precisely because you want to analyze it in the same terms
> as range/list partitioning that we have these issues. Or we could
> have built it on some other infrastructure than hash index opclasses
> ... but we didn't do that, and now we have a mess. I don't see a way
> out other than relaxing the guarantees about how hash partitioning
> works compared to the other kinds.

I suspect that using hash index opclasses was the right design --
sticking to the same approach to hashing seems valuable. I agree with
your overall conclusion, though, since it doesn't seem sensible to
allow hashing behavior to ever be anything greater than an
implementation detail. On general principle. What happens when a hash
function is discovered to have a huge flaw, as happened a couple of
times before now?

It's just the same with collations, where a particular collation
shouldn't be expected to have perfectly stable behavior across a dump
and reload. While admitting that possibility does open the door to
problems, in particular problems when range partitioning is in use,
those problems at least make sense. And they probably won't come up
very often -- collation updates don't often contain enormous
gratuitous differences that are liable to create dump/reload hazards
with range partitioning. It is the least worst approach, overall. In
theory, and in practice.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2023-02-01 23:22:06 Re: Perform streaming logical transactions by background workers and parallel apply
Previous Message Tom Lane 2023-02-01 23:14:36 Re: pg_dump versus hash partitioning