From: | Dimitrios Apostolou <jimis(at)gmx(dot)net> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | [WIP PATCH v2] Implement "pg_restore --data-only --clean" as a way to skip WAL |
Date: | 2025-06-10 00:26:28 |
Message-ID: | c61263f2-7472-5dd8-703d-01e683421f61@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, 14 Apr 2025, Dimitrios Apostolou wrote:
> Hello list,
>
> I implemented --clean support for --data-only, in order to avoid logging to
> the WAL while populating the database. The attached patch issues a TRUNCATE
> before COPY on each worker process, and provides a significant speed
> advantage if the cluster is configure with wal_level=minimal.
>
> It also provides a safer way to load the database, as avoiding WAL logging
> also avoids potential and painful ENOSPACE on the WAL partition as I
> experienced in [1]. In other words it makes things much better for my use
> case.
>
> [1] https://www.postgresql.org/message-id/flat/076464ad-3d70-dd25-9e8f-e84f27decfba%40gmx.net
>
Rebased and attached v2 of the patch.
It needed some adjustments for the new flags --with-schema and
--with-data.
I have used this patch several times to pg_restore terabytes of tables
without logging through the WAL, and it performs great.
> But it has some rough edges. I would appreciate guidance and feedback.
The rough edges remain: TRUNCATE fails if there are foreign keys. So if
you try pg_restore --data-only --clean to a table referenced via foreign
keys, the patch will not work, as mentioned below.
>
> * When the table-to-be-TRUNCATEd is referenced as foreign key from other
> table, the whole transaction fails with:
>
> ERROR: cannot truncate a table referenced in a foreign key constraint
>
> 1. As a first step, when TRUNCATE fails I want to try a DELETE FROM
> instead, which has more chances of succeeding, and continuing with
> the COPY. How to detect the failure of ahprintf("TRUNCATE") and do
> the alternative without failing the whole transaction?
>
> 2. Why doesn't --disable-triggers help?
> To test this, I have manually issued
>
> ALTER TABLE x DISABLE TRIGGER ALL
>
> to every table and issued manual TRUNCATE still fails. Shouldn't
> postgres skip the referential integrity checks?
>
> 3. In my tests, all my tables start empty since I have just created the
> schema. Then pg_restore --data-only --clean first populates
> the /referencing/ tables, which is allowed because of disabled
> triggers, and then it tries to load the /referenced/ table.
>
> At this point the referential integrity is already broken. Getting an
> error when TRUNCATing the empty /referenced/ table doesn't make
> sense.
So is there a way to turn off the referential checks for a TRUNCATE?
Do you have any other feedback for this patch?
Thanks,
Dimitris
Attachment | Content-Type | Size |
---|---|---|
v2-0001-pg_restore-clean-data-only.patch | text/x-patch | 7.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Shinya Kato | 2025-06-10 00:43:10 | Re: Extend COPY FROM with HEADER <integer> to skip multiple lines |
Previous Message | Sami Imseih | 2025-06-10 00:14:28 | Re: add function for creating/attaching hash table in DSM registry |