parallel restore sometimes fails for FKs to partitioned tables

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: parallel restore sometimes fails for FKs to partitioned tables
Date: 2019-10-05 22:43:33
Message-ID: 20191005224333.GA9738@alvherre.pgsql
Lists: pgsql-hackers

Hello

While playing around I noticed that depending on the number of parallel
workers in pg_restore compared to the number of partitions a table has,
restoring an FK fails because the FK itself is restored before the index
partitions have completed restoring. The exact conditions to cause the
failure seem to vary depending on whether the dump is schema-only or not.

This can seemingly be fixed by having pg_dump make the FK constraint
depend on the attach of each index partition, as in the attached patch.
With this patch I no longer see failures.
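
To illustrate why the extra dependencies help, here is a toy sketch of
dependency-gated scheduling. It is not pg_restore's actual code: the
ToyItem struct and the toy_find() and toy_is_ready() helpers are
hypothetical, and the real scheduler is more involved. The point is only
that an item listing every partition attach as a dependency cannot be
handed to a worker until all of those attaches are done.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a restore item (not the real TOC entry). */
typedef struct ToyItem
{
    int         id;     /* hypothetical dump ID */
    bool        done;   /* has this item finished restoring? */
    const int  *deps;   /* IDs of the items this one must wait for */
    size_t      ndeps;
} ToyItem;

/* Look up an item by ID with a linear search; returns NULL if absent. */
static const ToyItem *
toy_find(const ToyItem *items, size_t nitems, int id)
{
    for (size_t i = 0; i < nitems; i++)
    {
        if (items[i].id == id)
            return &items[i];
    }
    return NULL;
}

/*
 * An item may be handed to a worker only once every dependency is done.
 * In this toy model an unknown dependency counts as "not done".
 */
static bool
toy_is_ready(const ToyItem *items, size_t nitems, const ToyItem *item)
{
    for (size_t i = 0; i < item->ndeps; i++)
    {
        const ToyItem *dep = toy_find(items, nitems, item->deps[i]);

        if (dep == NULL || !dep->done)
            return false;
    }
    return true;
}
```

In this model, an FK item whose deps array names only the parent index
becomes ready too early; adding one entry per partition attach keeps it
waiting until the last attach has completed.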

This patch is a bit weird because I added a new "simple list" type, to
store pointers. One alternative would be to store the dumpId values for
the partitions instead, but we don't have a DumpId-typed simple list
either. We could solve that by casting the DumpId to Oid, but that
seems almost as strange as the current proposal.
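
For readers who want a picture of what a pointer-holding "simple list"
might look like, here is a minimal sketch: a singly linked list with
head and tail pointers, which is the shape I am assuming for it. The
names PtrList, ptr_list_append() and ptr_list_free() are made up for
illustration, and the attached patch may well differ in naming and in
how it handles allocation failure.

```c
#include <stdbool.h>
#include <stdlib.h>

/* One cell of the hypothetical pointer list. */
typedef struct PtrListCell
{
    struct PtrListCell *next;
    void       *ptr;    /* the stored pointer; the list does not own it */
} PtrListCell;

/* List header: head for iteration, tail for O(1) append. */
typedef struct PtrList
{
    PtrListCell *head;
    PtrListCell *tail;
} PtrList;

/* Append a pointer at the end; returns false if allocation fails. */
static bool
ptr_list_append(PtrList *list, void *ptr)
{
    PtrListCell *cell = malloc(sizeof(PtrListCell));

    if (cell == NULL)
        return false;

    cell->next = NULL;
    cell->ptr = ptr;

    if (list->tail != NULL)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    return true;
}

/* Free the cells (not the pointed-to objects) and reset the list. */
static void
ptr_list_free(PtrList *list)
{
    PtrListCell *cell = list->head;

    while (cell != NULL)
    {
        PtrListCell *next = cell->next;

        free(cell);
        cell = next;
    }
    list->head = NULL;
    list->tail = NULL;
}
```

Storing pointers avoids a lookup back from an ID to the object, at the
cost of one more list flavor to maintain.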

The other thing that makes this patch a little weird is that we have to
scan the list of indexes in the referenced partitioned table in order to
find the correct one. This should be okay, as the number of indexes in
any one table is not expected to grow very large. This isn't easy to
fix because we don't have a bsearchable array of indexes like we do of
other object types, and this already requires some contortions nearby.
Still, I'm not sure that this absolutely needs fixing now.
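
The scan in question amounts to something like the sketch below. The
ToyIndex and ToyTable structs and toy_find_index() are hypothetical
stand-ins rather than pg_dump's real data structures, and I am assuming
for illustration that the match is made on the index OID.

```c
#include <stddef.h>

typedef unsigned int ToyOid;    /* stand-in for an object identifier */

/* Hypothetical per-index record. */
typedef struct ToyIndex
{
    ToyOid      oid;
    const char *name;
} ToyIndex;

/* Hypothetical per-table record holding an unsorted array of indexes. */
typedef struct ToyTable
{
    const ToyIndex *indexes;
    size_t      numIndexes;
} ToyTable;

/*
 * Linear scan over the table's indexes: O(n) in the number of indexes
 * on this one table. Returns NULL when no index has the given OID.
 */
static const ToyIndex *
toy_find_index(const ToyTable *tbl, ToyOid indexOid)
{
    for (size_t i = 0; i < tbl->numIndexes; i++)
    {
        if (tbl->indexes[i].oid == indexOid)
            return &tbl->indexes[i];
    }
    return NULL;
}
```

Since a table rarely carries more than a handful of indexes, the linear
cost should be negligible next to the rest of the dump.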

--
Álvaro Herrera Developer, https://www.PostgreSQL.org/

Attachment Content-Type Size
dump-fks-parallel.patch text/x-diff 6.0 KB
