Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: Stepan Neretin <slpmcf(at)gmail(dot)com>
Cc: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org, bruce(at)momjian(dot)us, Christophe Courtois <christophe(dot)courtois(at)dalibo(dot)com>
Subject: Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized
Date: 2025-07-23 21:23:47
Message-ID: q479op22-n5pp-n434-n25n-44or777p5r80@tzk.arg
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello Stepan!

I can see you tested my patch by itself, and with the following command:

On Monday 2025-07-21 07:58, Stepan Neretin wrote:

>      pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists --freeze "$DUMP_FILE" > /dev/null

On the other hand, I have tested combining --freeze with --data-only
--clean, as implemented by another patch of mine at:

https://www.postgresql.org/message-id/flat/413c1cd8-1d6d-90ba-ac7b-b226a4dad5ed%40gmx.net#44f06ce85a14a6238125a0ad8c9767db

In fact I was considering that patch as a requirement to this one, but I
see that you are using my patch independently to issue --clean --freeze
without --data-only. As far as I understand, this will also issue a
TRUNCATE before each COPY, and the benefits of FREEZE will be the same.
Thank you for revealing this to me.

>The main area of concern is the implementation method in `pg_backup_archiver.c`. The patch introduces a `do_copy` function that modifies the `COPY` statement string to inject the `WITH (FREEZE)` option.
>
>```c
>if (cp_end - 10 > cp &&
>    strncmp(cp_end - 10, "FROM stdin", 10) == 0
>    )
>    cp_is_well_formed = true;
>```
>
>This approach of string parsing is quite fragile. It makes a strong assumption that the `COPY` statement generated by `pg_dump` will always end with `... FROM stdin;` (preceded by optional whitespace). While this is true now, it creates a tight and brittle coupling between `pg_dump`'s output format and
>`pg_restore`'s parsing logic. Any future changes to `pg_dump` that might add options or slightly alter the `COPY` syntax could break this feature silently or with a non-obvious warning.

It troubled me too. At least I have a pg_log_warning() message in this
case, and nothing detrimental will happen, it's just that the FREEZE
option will not be used and the message will be logged.

>
>A more robust solution would be to avoid string manipulation entirely. `pg_restore` should assemble the `COPY` command from its constituent parts (table name, column list, etc.) and then conditionally append the `WITH (FREEZE)` clause before the final `FROM stdin;`. This would decouple the feature from
>the exact string representation in the dump archive.

Totally agree, but this sounds impossible to implement. Unfortunately
pg_dump generates SQL commands, and pg_restore edits them and executes
them. I wouldn't know where to start to change this whole architecture.

>
>An alternative—and arguably cleaner—approach might be to shift this logic to pg_dump.

Thinking it more thoroughly, pg_dump is not the place to compose the
restoration SQL commands. The proper place is pg_restore.
I guess it's being done in pg_dump for historical reasons, but I don't
think pg_dump should complicate its commands further (e.g. by adding
WITH FREEZE), as this is a choice to make during restore.
I think I've seen more places in pg_restore that edit the
commands before executing them.

>One important consideration that needs to be highlighted in the documentation for this feature is the impact on WAL generation. `COPY ... WITH (FREEZE)` is known to generate significantly more WAL traffic because it must log the freezing of tuples, which can be a surprise for users. Maybe we can insert
>already frozen pages?

There should be no WAL traffic at all. If the transaction starts with a
TRUNCATE TABLE then COPY skips the wal completely. This gives a big
performance and stability gain when bulk-loading data, and this is what
my 2 patches try to leverage in more cases.

And as far as I can tell, if there is no TRUNCATE in the same
transaction, then pg_restore will output error like the following:

ERROR: cannot perform COPY FREEZE because the table was not created or truncated in the current subtransaction

I hope such an erroris acceptable, since the sysadmin can just remove
--freeze then and re-run the command.

>Additionally, it should be noted that the freeze option only works correctly when performing a fresh load of data into empty tables.

Do you think this patch has chances of going in by itself? If so then
yes, I should definitely update the docs and submit it again
individually.

Thank you for the rebase and review.

Dimitris

In response to

Browse pgsql-performance by date

  From Date Subject
Previous Message Pavel Stehule 2025-07-23 15:25:06 Re: proposal: schema variables