Re: pg_dump versus hash partitioning

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew <pgsqlhackers(at)andrewrepp(dot)com>
Subject: Re: pg_dump versus hash partitioning
Date: 2023-02-01 21:12:31
Message-ID: CAH2-Wzn+7oO+NyorybEe==xNP_oye9soe3kv03HUxg+dOv+PzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 1, 2023 at 12:34 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Also, and I think pretty
> > significantly, using --load-via-partition-root forces you to pay the
> > overhead of rerouting every tuple to the target partition whether you
> > need it or not, which is potentially a large unnecessary expense.
>
> Oddly, I always thought that we prioritize correctness over speed.
> I don't mind having an option that allows people to select a less-safe
> way of doing this, but I do think it's unwise for less-safe to be the
> default, especially when it's something you can't fix after the fact.

+1

As you pointed out already, pg_dump's primary use-cases all involve
target systems that aren't identical to the source system. Anybody
using pg_dump is unlikely to be particularly concerned about
performance.

> What do you think of "--load-via-partition-root=on/off/auto", where
> auto means "not with hash partitions" or the like? I'm still
> uncomfortable about the collation aspect, but I'm willing to concede
> that range partitioning is less likely to fail in this way than hash.

Currently, pg_dump ignores collation versions entirely, except when
run by pg_upgrade. So pg_dump already understands that sometimes it's
important that the collation behavior be completely identical when the
database is restored, and sometimes it's desirable to produce a dump
with any available "logically equivalent" collation. This is about the
high level requirements, which makes sense to me.

ISTM that range partitioning is missing a good high level model that
builds on that. What's really needed is a fully worked out abstraction
that recognizes how collations can be equivalent for some purposes,
but not other purposes. The indirection between "logical and physical
collations" is underdeveloped. There isn't even an official name for
that idea.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Melanie Plageman 2023-02-01 21:12:54 Re: heapgettup refactoring
Previous Message Tom Lane 2023-02-01 21:12:13 Re: pg_dump versus hash partitioning