ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized
Date: 2025-06-03 15:34:12
Message-ID: 5f002654-92e9-8032-3e1a-134461b4e819@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello list,

I'm debugging the abysmal performance of pg_restoring a huge
(10TB) database, which includes a table with more than 1000 partitions.

As part of pg_restore -j... --section=post-data
I see *days* being spent in the sequential creation of foreign keys:

ALTER TABLE the_master_partitioned_table
ADD CONSTRAINT ...
FOREIGN KEY (columnX) REFERENCES another_table(columnX)

Each of the above queries take 1-2 days, and multiple such queries are
issued sequentially.

My questions:

+ This master_partitioned_table contains at least 5 foreign key
constraints from different columns to different tables.
Despite pg_restore being invoked with -j, they all run sequentially.
I guess pg_restore dependency resolution decides so. Is there a
reason for that?

+ The backend process for each of the above ALTER TABLE commands, does not
parallelize the foreign key checks for the different partitions. I
know, because in the logs I see gigabytes of temporary files being
written, with the CONTEXT showing queries issued incrementally on
all the different partitions:

:LOG: temporary file: path "pg_tblspc/16390/PG_17_202406281/pgsql_tmp/pgsql_tmp3363462.579", size 1073741824
:CONTEXT: SQL statement "SELECT fk."columnX" FROM ONLY "public"."table_partition_214" fk
LEFT OUTER JOIN ONLY "public"."another_table" pk
ON ( pk."columnX" OPERATOR(pg_catalog.=) fk."columnX")
WHERE pk."columnX" IS NULL AND (fk."columnX" IS NOT NULL)"

Why can't the backend issue these queries in parallel workers?

+ Based on the pg_restore manual, I am experimenting with manually issuing
DISABLE TRIGGERS before the restoration, but I can't see a difference
when I'm doing the section=post-data separately. Is it supposed to speed
things up?

Thanks in advance,
Dimitris

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Frédéric Yhuel 2025-06-04 07:00:00 Re: ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized
Previous Message Pavel Stehule 2025-06-03 11:43:21 Re: proposal: schema variables