From: | Dimitrios Apostolou <jimis(at)gmx(dot)net> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | In-order pg_dump (or in-order COPY TO) |
Date: | 2025-08-26 19:43:44 |
Message-ID: | s0491qrn-343s-0757-8sn5-120rr8610qqq@tzk.arg |
Lists: | pgsql-general |
Hello list,
I am storing dumps of a database (pg_dump custom format) on a
de-duplicating backup server. Each dump is many terabytes in size, so
de-duplication is very important. The de-duplication is based on
rolling checksums, which makes it pretty flexible: it can compensate for
blocks moving by some offset.
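To illustrate what "compensating for blocks moving by some offset" means, here is a minimal sketch of content-defined chunking driven by a rolling checksum. It is not the backup server's actual algorithm: the window size, the cut mask, the plain byte-sum checksum and the name `chunk_digests` are all assumptions chosen to keep the example short.

```python
import hashlib
import random

WINDOW = 16   # bytes covered by the rolling checksum (assumed value)
MASK = 0xFF   # cut when the low 8 bits are zero (assumed value)


def chunk_digests(data: bytes) -> set:
    """Split data at content-defined boundaries; return the set of chunk digests."""
    digests = set()
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # slide the window forward by one byte
        # A boundary depends only on the last WINDOW bytes, not on the offset.
        if i + 1 >= WINDOW and (rolling & MASK) == 0:
            digests.add(hashlib.sha256(data[start:i + 1]).hexdigest())
            start = i + 1
    if start < len(data):
        digests.add(hashlib.sha256(data[start:]).hexdigest())
    return digests


# Hypothetical demo: the same payload with a few bytes inserted at the front.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
shifted = b"extra" + original

shared = chunk_digests(original) & chunk_digests(shifted)
print(len(shared), "of", len(chunk_digests(original)), "chunks survive the shift")
```

Because each boundary is decided by the bytes inside the window alone, inserting data at the front changes only the first chunk. The same reasoning suggests why reordered rows would defeat it: a chunk that spans several rows no longer exists anywhere in the new stream.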
Unfortunately, after I did a pg_restore to a new server, I noticed that the
dumps from the new server are not being de-duplicated; all blocks are
considered new.
This means that the data has been significantly altered. The new dumps
contain the same rows, but probably in a very different order. Could the
row order have changed when doing COPY FROM with pg_restore? No idea,
but now that I think about it, many operations can cause this, such as
CLUSTER or VACUUM FULL, so the question still applies.
A *logical* dump of data shouldn't be affected by on-disk order.
Internal representation shouldn't affect the output.
This makes me wonder: Is there a way to COPY TO in primary-key order?
If that is possible, then pg_dump could make use of it.
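For concreteness, COPY accepts a parenthesized query in place of a table name, so an ordered export can at least be written by hand. The sketch below only builds such a statement. The helper name `ordered_copy_sql` and the table and column names are hypothetical, and pg_dump is not claimed to offer anything like this.

```python
def ordered_copy_sql(table: str, key_columns: list) -> str:
    """Build a COPY statement that emits rows sorted by the given key columns."""

    def quote_ident(name: str) -> str:
        # Double any embedded quotes, as SQL requires for quoted identifiers.
        return '"' + name.replace('"', '""') + '"'

    if not key_columns:
        raise ValueError("at least one key column is required")
    order_by = ", ".join(quote_ident(col) for col in key_columns)
    return (
        f"COPY (SELECT * FROM {quote_ident(table)} "
        f"ORDER BY {order_by}) TO STDOUT"
    )


# Hypothetical table "orders" with primary key "id".
print(ordered_copy_sql("orders", ["id"]))
```

The ORDER BY makes the output independent of the on-disk row order, at the cost of a sort or an index scan, which may be significant for tables of this size.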
Thanks in advance,
Dimitris