pg_verifybackup can fail with a valid backup for TLI > 1

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: pg_verifybackup can fail with a valid backup for TLI > 1
Date: 2021-08-18 05:30:31
Message-ID: 20210818.143031.1867083699202617521.horikyota.ntt@gmail.com
Lists: pgsql-hackers

Hello.

pg_verifybackup can fail on a valid backup when the backup was taken
from a server running on a timeline (TLI) greater than 1.

=====
# initdb
$ pg_ctl start (just to make sure)
$ pg_ctl stop
$ touch $PGDATA/standby.signal
$ pg_ctl start
$ pg_ctl promote
$ psql
postgres=# select pg_switch_wal();
pg_switch_wal
---------------
0/14FE340
(1 row)
postgres=# checkpoint;
postgres=# ^D
$ pg_basebackup -Fp -h /tmp -D tmpbk
$ pg_verifybackup tmpbk
(thinking.. thinking.. zzz.. for several seconds)
pg_waldump: fatal: could not find file "000000020000000000000001": No such file or directory
pg_verifybackup: error: WAL parsing failed for timeline 2
=====

This is because the backup_manifest has the wrong values, as shown below.

> "WAL-Ranges": [
> { "Timeline": 2, "Start-LSN": "0/14FE248", "End-LSN": "0/3000100" }
> ],

The "Start-LSN" above is the LSN at which timeline 2 began, not the
backup-start LSN. The segment containing that LSN had already been
removed by the checkpoint, so pg_waldump cannot find it.
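
To make the connection between the manifest's Start-LSN and the missing
file explicit, here is a minimal Python sketch that maps a timeline and an
LSN to a WAL segment file name. It assumes the default 16 MB WAL segment
size; parse_lsn() and wal_file_name() are illustrative helper names, not
PostgreSQL functions.

```python
# Sketch only: assumes the default WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_file_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Return the 24-hex-digit WAL file name holding the given LSN."""
    segno = lsn // segment_size
    # Number of segments covered by one 4 GB "xlogid" unit.
    segs_per_xlogid = 0x100000000 // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segs_per_xlogid,
        segno % segs_per_xlogid,
    )


# Start-LSN from the manifest above, on timeline 2.
print(wal_file_name(2, parse_lsn("0/14FE248")))
# -> 000000020000000000000001
```

The result is exactly the file that pg_waldump reports as missing.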

The comment for AddWALInfoToBackupManifest() says:
> * Add information about the WAL that will need to be replayed when restoring
> * this backup to the manifest.

So I concluded that it's a thinko.
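
The behaviour that comment implies can be sketched roughly as follows, in
Python. This is not the attached patch, only a model of the intended rule;
manifest_start_lsn() and its parameters are hypothetical names, and the
backup-start LSN used in the example (0/2000028) is an assumed value for
illustration, not one taken from the run above.

```python
def parse_lsn(text):
    """Convert an LSN written as 'hi/lo' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def manifest_start_lsn(entry_tli, entry_begin, backup_start_tli, backup_start_lsn):
    """Pick the Start-LSN of the WAL range recorded for one timeline.

    Only WAL that must be replayed on restore belongs in the manifest, so
    on the timeline where the backup started the range begins at the
    backup-start LSN. Later timelines are needed from their beginning.
    """
    if entry_tli == backup_start_tli:
        return backup_start_lsn
    return entry_begin


tli2_begin = parse_lsn("0/14FE248")    # where timeline 2 began (from the manifest)
backup_start = parse_lsn("0/2000028")  # assumed backup-start LSN, illustration only

start = manifest_start_lsn(2, tli2_begin, 2, backup_start)
print("%X/%X" % (start >> 32, start & 0xFFFFFFFF))
```

Under this rule the recorded range would no longer reach back into the
segment that the checkpoint removed.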

Please see the attached. It needs to be back-patched to 13, but the
patch does not apply to 13 as is because of wording changes in the TAP
tests, so a separate version for 13 is attached.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v1-0001-Fix-basebackup-to-generate-correct-WAL-Ranges-inf_14-master.patch text/x-patch 3.0 KB
v1-0001-Fix-basebackup-to-generate-correct-WAL-Ranges-inf_13.patch text/x-patch 3.0 KB
