Re: pg_dump versus hash partitioning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew <pgsqlhackers(at)andrewrepp(dot)com>
Subject: Re: pg_dump versus hash partitioning
Date: 2023-02-14 19:21:33
Message-ID: 1912821.1676402493@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's a set of draft patches around this issue.

0001 does what I last suggested, ie force load-via-partition-root for
leaf tables underneath a partitioned table with a partitioned-by-hash
enum column. It wasn't quite as messy as I first feared, although we do
need a new query (and pg_dump now knows more about pg_partitioned_table
than it used to).

I was a bit unhappy to read this in the documentation:

It is best not to use parallelism when restoring from an archive made
with this option, because <application>pg_restore</application> will
not know exactly which partition(s) a given archive data item will
load data into. This could result in inefficiency due to lock
conflicts between parallel jobs, or perhaps even restore failures due
to foreign key constraints being set up before all the relevant data
is loaded.

This made me wonder if this could be a usable solution at all, but
after thinking for awhile, I don't see how the claim about foreign key
constraints is anything but FUD. pg_dump/pg_restore have sufficient
dependency logic to prevent that from happening. I think we can just
drop the "or perhaps ..." clause here, and tolerate the possible
inefficiency as better than failing.

0002 and 0003 are not part of the bug fix, but are some performance
improvements I noticed while working on this. 0002 is pretty minor,
but 0003 is possibly significant if you have a ton of partitions.
I haven't done any performance measurement on it though.

regards, tom lane

Attachment Content-Type Size
v1-0001-Force-load-via-partition-root-for-hash-partitioni.patch text/x-diff 8.7 KB
v1-0002-Don-t-create-useless-TableAttachInfo-objects.patch text/x-diff 1.5 KB
v1-0003-Simplify-pg_dump-s-creation-of-parent-table-links.patch text/x-diff 6.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-02-14 20:13:18 Re: doc: add missing "id" attributes to extension packaging page
Previous Message Andres Freund 2023-02-14 19:08:12 Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)