| From: | Amul Sul <sulamul(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Tkachenko <oatkachenko(at)gmail(dot)com> |
| Subject: | pg_combinebackup: incorrect size of VM fork after combine |
| Date: | 2026-03-04 04:50:08 |
| Message-ID: | CAAJ_b97PqG89hvPNJ8cGwmk94gJ9KOf_pLsowUyQGZgJY32o9g@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Jan 7, 2026 at 8:20 PM Oleg Tkachenko <oatkachenko(at)gmail(dot)com> wrote:
>
> Hello Robert,
>
> I checked the VM fork file and found that its incremental version has a wrong
> block number in the header:
>
> ```
> xxd -l 12 INCREMENTAL.16384_vm
> 0d1f aed3 0100 0000 0000 0200 <--- 131072 blocks (1 GB)
> ^^^^ ^^^^
> ```
>
> This value can only come from the WAL summaries, so I checked them too.
> One of the summary files contains:
>
> ```
> TS 1663, DB 5, REL 16384, FORK main: limit 131073
> TS 1663, DB 5, REL 16384, FORK vm: limit 131073
> TS 1663, DB 5, REL 16384, FORK vm: block 4
>
> ```
>
> Both forks have the same limit, which looks wrong.
> So I checked the WAL files to see what really happened with the VM fork.
> I did not find any "truncate" records for the VM file.
> I only found this record for the main fork
> (actually, the fork isn’t mentioned at all):
>
> ```
> rmgr: Storage len (rec/tot): 46/46, tx: 759, lsn: 0/4600D318,
> prev 0/4600B2C8, desc: TRUNCATE base/5/16384 to 131073 blocks flags 7
> ```
>
> This suggests that the WAL summarizer may be mixing up information between
> relation forks.
>
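For reference, the 12 header bytes from the xxd output quoted above can be
decoded with a few lines of Python. This is only a sketch: the field names
and the layout (three little-endian uint32 values: magic, number of blocks,
truncation block length) are my reading of the incremental file format, and
the 8 kB block size is the default build setting, so treat both as
assumptions.

```python
import struct

BLCKSZ = 8192  # assumption: default PostgreSQL block size

# The 12 bytes shown by `xxd -l 12 INCREMENTAL.16384_vm` in the quoted message.
header = bytes.fromhex("0d1faed3" "01000000" "00000200")

# Assumed layout: three little-endian uint32 fields.
magic, num_blocks, truncation_block_length = struct.unpack("<III", header)

print(hex(magic))                        # magic number of the incremental file
print(num_blocks)                        # blocks stored in this incremental file
print(truncation_block_length)           # length the reconstructed file is extended to
print(truncation_block_length * BLCKSZ)  # same length in bytes
```

The third field decodes to 131072 blocks, i.e. 1 GB at 8 kB per block, which
is the value Oleg flagged as wrong for a VM fork.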
The issue in $subject was found while discussing another bug in the
incremental backup feature [1].

The problem occurs when a relation spanning multiple segments
(e.g., > 1 GB) is truncated down to a single segment (or a smaller
size) via VACUUM. The truncation is logged as a truncate WAL record
with the SMGR_TRUNCATE_ALL flag. When a subsequent incremental backup
is taken and then processed by pg_combinebackup, the Visibility Map
(VM) fork in the combined backup is reconstructed with an incorrect,
"insanely high" size: a size equal to that of the main fork.
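To make the "flags 7" in the quoted TRUNCATE record concrete, here is a
small Python sketch that decodes the flag bits. The bit values are what I
recall from storage_xlog.h, and the helper name is made up for illustration,
so treat both as assumptions. The point is that the record carries a single
block count (expressed in main-fork blocks) while the flags say that all
three forks are being truncated.

```python
# Assumed flag values (as I recall them from storage_xlog.h).
SMGR_TRUNCATE_HEAP = 0x0001
SMGR_TRUNCATE_VM = 0x0002
SMGR_TRUNCATE_FSM = 0x0004
SMGR_TRUNCATE_ALL = SMGR_TRUNCATE_HEAP | SMGR_TRUNCATE_VM | SMGR_TRUNCATE_FSM


def forks_in_truncate_flags(flags):
    """Hypothetical helper: list the forks a truncate record applies to."""
    forks = []
    if flags & SMGR_TRUNCATE_HEAP:
        forks.append("main")
    if flags & SMGR_TRUNCATE_VM:
        forks.append("vm")
    if flags & SMGR_TRUNCATE_FSM:
        forks.append("fsm")
    return forks


# The record from the quoted message: "to 131073 blocks flags 7".
print(forks_in_truncate_flags(7))
```

Using the one block count from the record as the limit for every fork named
in the flags would produce exactly the identical "limit 131073" entries for
main and vm seen in the summary file.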
I have attached a small reproducer that modifies an existing test case
and makes it fail so that the file sizes can be inspected. Apply it to
the master branch and run:

    cd src/bin/pg_combinebackup/
    make check PROVE_TESTS='t/011_ib_truncation.pl'

The backups used for testing will be in the
"tmp_check/t_011_ib_truncation_primary_data/backup/" directory, and
the combined backup result will be in
"tmp_check/t_011_ib_truncation_node2_data/pgdata/".
If you inspect the relation forks in the final combined backup, you
will see the VM size discrepancy (16384 is the test relation OID):

    ll -h tmp_check/t_011_ib_truncation_node2_data/pgdata/base/5/ | grep 16384
    -rw-------. 1 amul 1.0G Feb 19 17:10 16384      <----- main fork file
    -rw-------. 1 amul 8.0K Feb 19 17:10 16384.1
    -rw-------. 1 amul 280K Feb 19 17:10 16384_fsm
    -rw-------. 1 amul 1.0G Feb 19 17:10 16384_vm   <----- vm fork file (1 GB)
The reason, as Oleg explained in the same thread [1], is that the
summary file recorded an incorrect size limit for the VM fork due to a
truncation WAL record with the SMGR_TRUNCATE_ALL flag.

I think the fix is to correct the WAL summary entry that records
an incorrect truncation limit for the VM fork. Attached are the
patches: 0001 is a refactoring patch that moves the necessary macro
definitions from visibilitymap.c to visibilitymap.h so that the VM
fork limit recorded in the WAL summary file can be calculated
correctly, and 0002 provides the actual fix.
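To illustrate the kind of calculation involved (this is not the code in
0002, just a sketch), the number of VM blocks needed to cover a given number
of heap blocks can be derived as below. The constants mirror my
understanding of the mapping macros in visibilitymap.c with the default 8 kB
block size, a 24-byte page header and 2 bits per heap block; the function
names are made up for the example.

```python
BLCKSZ = 8192            # assumption: default block size
PAGE_HEADER_SIZE = 24    # assumption: MAXALIGN(SizeOfPageHeaderData)
BITS_PER_HEAPBLOCK = 2   # all-visible and all-frozen bits

MAPSIZE = BLCKSZ - PAGE_HEADER_SIZE               # usable bytes per VM page
HEAPBLOCKS_PER_BYTE = 8 // BITS_PER_HEAPBLOCK     # heap blocks tracked per byte
HEAPBLOCKS_PER_PAGE = MAPSIZE * HEAPBLOCKS_PER_BYTE


def heapblk_to_mapblock(heap_blk):
    """VM block that holds the bits for the given heap block."""
    return heap_blk // HEAPBLOCKS_PER_PAGE


def vm_blocks_needed(n_heap_blocks):
    """Hypothetical helper: VM blocks needed to cover n_heap_blocks."""
    return -(-n_heap_blocks // HEAPBLOCKS_PER_PAGE)  # ceiling division


heap_limit = 131073  # truncation target from the WAL record
print(heapblk_to_mapblock(heap_limit))
print(vm_blocks_needed(heap_limit))
print(vm_blocks_needed(heap_limit) * BLCKSZ)
```

With these assumptions, a heap truncated to 131073 blocks maps to VM block 4
(the same block number that shows up in the summary file) and needs a VM
fork of only 5 blocks, i.e. 40 kB rather than 1 GB.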
[1] http://postgr.es/m/6897DAF7-B699-41BF-A6FB-B818FCFFD585@gmail.com
--
Regards,
Amul Sul
EDB: http://www.enterprisedb.com
| Attachment | Content-Type | Size |
|---|---|---|
| repro-incremental_backup.patch.no-cfbot | application/octet-stream | 1.4 KB |
| v1-0001-Refactor-Expose-visibility-map-mapping-macros-in-.patch | application/x-patch | 3.4 KB |
| v1-0002-Fix-incorrect-VM-fork-truncation-limit-in-WAL-sum.patch | application/x-patch | 1.8 KB |