Re: In-order pg_dump (or in-order COPY TO)

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: In-order pg_dump (or in-order COPY TO)
Date: 2025-08-27 12:09:39
Message-ID: 3541781s-75o7-26pp-46pp-qs54o4406192@tzk.arg
Lists: pgsql-general

On Wednesday 2025-08-27 00:54, Adrian Klaver wrote:

>Date: Wed, 27 Aug 2025 00:54:52
>From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>To: Dimitrios Apostolou <jimis(at)gmx(dot)net>, pgsql-general(at)lists(dot)postgresql(dot)org
>Subject: Re: In-order pg_dump (or in-order COPY TO)
>
> On 8/26/25 12:43, Dimitrios Apostolou wrote:
>> Hello list,
>>
>> I am storing dumps of a database (pg_dump custom format) in a de-
>> duplicating backup server. Each dump is many terabytes in size, so
>> deduplication is very important. And de-duplication itself is based on
>> rolling checksums which is pretty flexible, it can compensate for blocks
>> moving by some offset.
>>
>> Unfortunately after I did pg_restore to a new server, I notice that the
>> dumps from the new server are not being de-duplicated, all blocks are
>> considered new.
>

> What are the pg_dump/pg_restore commands?
>
> What are the Postgres versions involved?
>
> Are they community versions of Postgres or something else?
>
> What is the deduplication program?
>
>

The dump is taken from PostgreSQL 16, with pg_dump writing to stdout:

pg_dump -v --format=custom --compress=none --no-toast-compression --serializable-deferrable db_name | borg create ...

As you can see, the backup (and de-duplicating) program is borgbackup.

Restore is in PostgreSQL 17:

I first create the empty tables by running the DDL commands kept in version
control to set up the database. Then I run a data-only pg_restore:

pg_restore -vvvv -j 8 -U db_owner -d db_name --schema=public --section=data dump_file

It is worth noting that the above pg_restore goes through the WAL, i.e. all
writes are done by walwriter rather than by the backend directly.

Postgres is the standard open-source version, running on our own server. It
has a couple of custom patches that shouldn't matter in this code path.
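
To illustrate the rolling-checksum de-duplication mentioned in the quoted
message, here is a minimal toy sketch in Python. It is not borg's actual
chunker: the gear-style rolling hash, the table GEAR, the CUT_MASK value and
the 64-byte "rows" are all assumptions chosen for the demonstration. It only
shows the general property that content-defined chunk boundaries survive data
moving by an offset, but mostly do not survive the same rows being written in
a different order when chunks span several rows.

```python
import hashlib
import random

# Hypothetical toy parameters, not borg's real chunker settings.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte
CUT_MASK = 0xFF000000  # cut when the top 8 bits are zero: ~256-byte chunks


def chunks(data: bytes):
    """Split data at content-defined boundaries using a gear rolling hash."""
    out, start, h = [], 0, 0
    for i, b in enumerate(data):
        # The left shift pushes bytes older than 32 positions out of the
        # 32-bit hash, so h depends only on the last 32 bytes and not on
        # the absolute offset in the stream.
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        if (h & CUT_MASK) == 0:
            out.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        out.append(data[start:])  # trailing partial chunk
    return out


def shared_fraction(base: bytes, other: bytes) -> float:
    """Fraction of other's chunks whose digest already exists in base."""
    known = {hashlib.sha256(c).digest() for c in chunks(base)}
    new = [hashlib.sha256(c).digest() for c in chunks(other)]
    return sum(d in known for d in new) / len(new)


# 2000 made-up "rows" of 64 random bytes each stand in for table data.
data_rng = random.Random(1)
rows = [bytes(data_rng.getrandbits(8) for _ in range(64)) for _ in range(2000)]
original = b"".join(rows)

shifted = b"extra header" + original   # same rows, moved by an offset
shuffled_rows = rows[:]
data_rng.shuffle(shuffled_rows)
reordered = b"".join(shuffled_rows)    # same rows, different order

shifted_share = shared_fraction(original, shifted)
reordered_share = shared_fraction(original, reordered)
print(f"shifted:   {shifted_share:.0%} of chunks already known")
print(f"reordered: {reordered_share:.0%} of chunks already known")
```

In this toy setup the shifted stream should share nearly all of its chunks
with the original, while the reordered stream should share only a small
fraction. Whether a changed row order is what happens between the two servers
here is not established by this sketch.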

>> Thanks in advance,
>> Dimitris
