Re: pg_dump versus hash partitioning

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew <pgsqlhackers(at)andrewrepp(dot)com>
Subject: Re: pg_dump versus hash partitioning
Date: 2023-02-02 00:15:28
Message-ID: CAApHDvqJM5HxhZSnxESENAHLxTZ=eXKKHxeHfvVe1E8v28woDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 2 Feb 2023 at 11:38, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> > You mentioned "minor releases" here. Who said anything about that?
>
> I did: I'd like to back-patch the fix if possible. I think changing
> the default --load-via-partition-root choice could be back-patchable.
>
> If Robert is resistant to that but would accept it in master,
> I'd settle for that in preference to having no fix.

I'm not sure it'll help the discussion any, but on master, the
performance gap between using --load-via-partition-root and not using
it should be somewhat closed due to 3592e0ff9. So using that is likely
not as terrible as it once was.

[1] does not show results for inserting directly into partitions, but
the benchmark results do show that performance is better than it was
without the caching. The order that pg_dump outputs the rows should
mean the cache is hit most of the time for RANGE partitioned tables,
at least, and likely more often than not for LIST. HASH partitioning
is not really affected by that commit. The idea there is that it's
probably as cheap to hash as it is to do an equality check with the
last Datum.

Digging into the history a bit, I found [2] and particularly [3] that
seem to indicate this option was thought about due to concerns about
hash functions not returning consistent results on different
architectures. I suspect it might have been defaulted to load into the
leaf partitions for performance reasons, however. I mean, why else
would you?

David

[1] https://www.postgresql.org/message-id/CAApHDvqFeW5hvQqprXOLuGMMJSf%2B1C%2BWk4w_L-M03sVduF3oYg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAGPqQf0C1he087bz9xRBOGZBuESYz9X=Fp8Ca_g+TfHgAff75g@mail.gmail.com
[3] https://www.postgresql.org/message-id/CA%2BTgmoZFn7TJ7QBsFatnuEE%3DGYGdZSNXqr9489n5JBsdy5rFfA%40mail.gmail.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-02-02 01:02:25 Re: CI and test improvements
Previous Message Peter Smith 2023-02-01 23:22:06 Re: Perform streaming logical transactions by background workers and parallel apply