Re: pg_dump versus hash partitioning

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew <pgsqlhackers(at)andrewrepp(dot)com>
Subject: Re: pg_dump versus hash partitioning
Date: 2023-03-11 03:01:54
Message-ID: 20230311030154.pg63uspio35h4a6t@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 14, 2023 at 02:21:33PM -0500, Tom Lane wrote:
> Here's a set of draft patches around this issue.
>
> 0001 does what I last suggested, ie force load-via-partition-root for
> leaf tables underneath a partitioned table with a partitioned-by-hash
> enum column. It wasn't quite as messy as I first feared, although we do
> need a new query (and pg_dump now knows more about pg_partitioned_table
> than it used to).
>
> I was a bit unhappy to read this in the documentation:
>
> It is best not to use parallelism when restoring from an archive made
> with this option, because <application>pg_restore</application> will
> not know exactly which partition(s) a given archive data item will
> load data into. This could result in inefficiency due to lock
> conflicts between parallel jobs, or perhaps even restore failures due
> to foreign key constraints being set up before all the relevant data
> is loaded.
>
> This made me wonder if this could be a usable solution at all, but
> after thinking for awhile, I don't see how the claim about foreign key
> constraints is anything but FUD. pg_dump/pg_restore have sufficient
> dependency logic to prevent that from happening. I think we can just
> drop the "or perhaps ..." clause here, and tolerate the possible
> inefficiency as better than failing.

Working on some side project that can cause dump of hash partitions to be
routed to a different partition, I realized that --load-via-partition-root can
indeed cause deadlock in such case without FK dependency or anything else.

The problem is that each worker will perform a TRUNCATE TABLE ONLY followed by
a copy of the original partition's data in a transaction, and that obviously
will lead to deadlock if the original and locked partition and the restored
partition are different.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dean Rasheed 2023-03-11 03:07:53 Re: Lock mode in ExecMergeMatched()
Previous Message Melanie Plageman 2023-03-11 00:34:44 Re: Should vacuum process config file reload more often