From: Stepan Neretin <slpmcf(at)gmail(dot)com>
To: Dimitrios Apostolou <jimis(at)gmx(dot)net>
Cc: Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org, bruce(at)momjian(dot)us, Christophe Courtois <christophe(dot)courtois(at)dalibo(dot)com>
Subject: Re: [PATCH] ALTER TABLE ADD FOREIGN KEY to partitioned table, is not parallelized
Date: 2025-07-21 05:58:56
Message-ID: CA+Yyo5RA+KQPYZnsaKRRYpAKUpVKt2_xviKrqNrX_6pD7HkDpg@mail.gmail.com
Lists: pgsql-performance
On Mon, Jul 21, 2025 at 12:24 PM Dimitrios Apostolou <jimis(at)gmx(dot)net> wrote:
> >
> > FWIW I implemented a pg_restore --freeze patch, see attached. It needs
> > another patch of mine from [1] that implements pg_restore --data-only
> > --clean, which for parallel restores encases each COPY in its own
> > transaction and prepends it with a TRUNCATE. All feedback is welcome.
> >
> > [1] https://www.postgresql.org/message-id/c61263f2-7472-5dd8-703d-01e683421f61%40gmx.net
> >
> > It works really fast for the data, and I see that some, but not all
> > items from section=post-data, start parallel plans. For example I see
> > CREATE INDEX spawns parallel workers.
>
> I added it to July's commitfest, mostly to trigger discussion around the
> issue.
>
> https://commitfest.postgresql.org/patch/5826/
>
> Not sure how to mark on the commitfest page that the patch requires
> another patch from another commitfest entry.
>
>
> Dimitris
Hello Dimitris and fellow hackers,
I'm picking this patch up from the July CommitFest for review on behalf of
Warda, who is currently unavailable. Thanks, Dimitris, for submitting it and for
tackling this well-known performance issue with restoring very large
databases. The problem of `ALTER TABLE ... ADD FOREIGN KEY` taking days
after a restore due to the lack of a visibility map is a significant pain
point, and your proposed solution is a logical step.
I've reviewed the patch and the discussion in the thread. I've attached a
rebased version of the patch against the current master.
The core idea of adding a `--freeze` option to `pg_restore` to leverage
`COPY ... WITH (FREEZE)` is sound. It directly addresses the problem of
needing a very long `VACUUM` run after data loading to make subsequent
`post-data` steps, like foreign key validation, performant. The benefit
isn't just a minor speedup in the `COPY` itself but the major time-saving
of avoiding a full-table `VACUUM` on a massive dataset.
I ran a simple performance script based on your patch. On a small test
case, I saw a consistent 10-15% speedup during the data-loading phase.
While this is encouraging, the main benefit, as noted, is in the steps that
follow. Two runs of the script produced:

```
=== Summary ===
freeze average: 1.205 seconds
nofreeze average: 1.324 seconds

=== Summary ===
freeze average: 1.362 seconds
nofreeze average: 1.480 seconds
```
For reference, the test script:

```sh
#!/bin/bash
set -euo pipefail

DUMP_FILE="parttest.dump"
DB_PREFIX="parttest"
JOBS=$(nproc)
ROUNDS=3

timings() {
    local mode=$1
    local total_time=0

    for i in $(seq 1 "$ROUNDS"); do
        DB_NAME="${DB_PREFIX}_${mode}_$i"
        psql -q -c "DROP DATABASE IF EXISTS $DB_NAME"
        psql -q -c "CREATE DATABASE $DB_NAME"

        start_time=$(date +%s.%N)
        if [ "$mode" == "freeze" ]; then
            pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists --freeze "$DUMP_FILE" > /dev/null
        else
            pg_restore -d "$DB_NAME" -j "$JOBS" --clean --if-exists "$DUMP_FILE" > /dev/null
        fi
        end_time=$(date +%s.%N)

        duration=$(echo "$end_time - $start_time" | bc)
        total_time=$(echo "$total_time + $duration" | bc)
    done

    average=$(echo "scale=3; $total_time / $ROUNDS" | bc)
    echo "$mode average: $average seconds"
}

freeze_avg=$(timings freeze)
nofreeze_avg=$(timings nofreeze)

echo ""
echo "=== Summary ==="
echo "$freeze_avg"
echo "$nofreeze_avg"
```
The main area of concern is the implementation method in
`pg_backup_archiver.c`. The patch introduces a `do_copy` function that
modifies the `COPY` statement string to inject the `WITH (FREEZE)` option.
```c
if (cp_end - 10 > cp &&
    strncmp(cp_end - 10, "FROM stdin", 10) == 0)
    cp_is_well_formed = true;
```
This approach of string parsing is quite fragile. It makes a strong
assumption that the `COPY` statement generated by `pg_dump` will always end
with `... FROM stdin;` (preceded by optional whitespace). While this is
true now, it creates a tight and brittle coupling between `pg_dump`'s
output format and `pg_restore`'s parsing logic. Any future changes to
`pg_dump` that might add options or slightly alter the `COPY` syntax could
break this feature silently or with a non-obvious warning.
A more robust solution would be to avoid string manipulation entirely.
`pg_restore` should assemble the `COPY` command from its constituent parts
(table name, column list, etc.) and then conditionally append the `WITH
(FREEZE)` clause before the final `FROM stdin;`. This would decouple the
feature from the exact string representation in the dump archive.
An alternative, and arguably cleaner, approach might be to shift this logic
to pg_dump, which constructs the COPY statement in the first place and could
write the option into the archive at dump time.
One important consideration that needs to be highlighted in the
documentation for this feature is the impact on WAL generation. `COPY ...
WITH (FREEZE)` is known to generate significantly more WAL traffic because
it must log the freezing of tuples, which can be a surprise for users.
Maybe we can insert already frozen pages?
Additionally, it should be noted that the freeze option only takes effect
when loading data into a table that was created or truncated in the same
transaction, i.e. a fresh load into empty tables.
Thanks again for your work on this.
Regards,
Stepan Neretin
Attachment: 0001-pg_restore-freeze.patch (text/x-patch, 5.4 KB)