reassure me that it's good to copy pg_control last in a base backup

From: Chapman Flack <chap(at)anastigmatix(dot)net>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: reassure me that it's good to copy pg_control last in a base backup
Date: 2017-12-22 03:48:49
Message-ID: a8f5e2e8-777d-1339-afb0-21ec94ffeeb1@anastigmatix.net
Lists: pgsql-hackers

I've been using a base backup script that takes special care to
have pg_control be the last file it grabs. And I see that
basebackup.c takes similar care:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/replication/basebackup.c;h=81203c9f5ac9dbf38da09e1ff55b29846c83f514;hp=2fa1f5461356a191559b93b591c9037c6c75b389;hb=8366c7803ec3d0591cf2d1226fea1fee947d56c3;hpb=74ab96a45ef6259aa6a86a781580edea8488511a

But I need to swallow my pride and admit I'm not sure how to
reason about this. I think I'm being spooked by language in the
"WAL Internals" documentation section:

"... the checkpoint's position is saved in the file pg_control.
Therefore, at the start of recovery, the server first reads pg_control
and then the checkpoint record; then it performs the REDO operation by
scanning forward from the log position indicated in the checkpoint
record."

From that description alone, I'd imagine a danger in redoing from a
base backup in which pg_control was copied last. What if another
checkpoint was made (after the one done by pg_start_backup) during
the course of the backup, and the late-copied pg_control refers to
it, but some of the files had been copied into the base backup
too early to reflect it?

Looking harder, I think I see that the special care to grab
pg_control last was introduced for the case of taking a base backup
from a standby, and perhaps only matters in that case. The long
discussion seems to be this one:

https://www.postgresql.org/message-id/201108050646.p756kHC5023570%40ccmds32.silk.ntts.co.jp

What I think I've gleaned is:

1. The description in the doc ("at the start of recovery, the server
first reads pg_control and then the checkpoint record") only applies
to the kind of recovery that happens in an unexpected restart, using
the files that are present; it's not the whole story for the kind
of recovery that begins with a base backup.

2. In the case of recovery from a backup (that was taken from a master),
both the start and end location in pg_control are disregarded, in
favor of the backup label file and the backup end WAL record,
respectively, so it doesn't matter a whit whether pg_control was
copied early or late.

3. In recovery from a backup taken from a standby, there is a backup
label file but no backup end WAL record, so the 'minimum recovery
ending location' in pg_control has to be used, and that's why the
fuss about copying pg_control last when backing up from a standby.
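
The three cases above can be summarized as a toy model. This is only a
sketch of the understanding described in this message, not PostgreSQL
code: the names ControlFile, BackupLabel and recovery_bounds are
hypothetical, LSNs are plain integers, and the "replay to end of
available WAL" answer for crash recovery is an assumption not spelled
out in the list above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ControlFile:
    """Hypothetical stand-in for the fields of pg_control that matter here."""
    checkpoint_lsn: int       # position of the latest checkpoint
    min_recovery_point: int   # 'minimum recovery ending location'


@dataclass
class BackupLabel:
    """Hypothetical stand-in for the backup label file."""
    start_checkpoint_lsn: int  # checkpoint made when the backup started


def recovery_bounds(
    control: ControlFile,
    label: Optional[BackupLabel],
    backup_from_standby: bool = False,
) -> Tuple[int, str]:
    """Return (redo start position, where the required end point comes from)."""
    if label is None:
        # Case 1: restart using the files that are present; pg_control's
        # checkpoint position is the starting point.
        # Assumption: replay simply continues to the end of available WAL.
        return control.checkpoint_lsn, "end of available WAL"
    if backup_from_standby:
        # Case 3: no backup end WAL record exists, so the end point has to
        # come from pg_control, which is why it must be copied last.
        return label.start_checkpoint_lsn, "pg_control minimum recovery point"
    # Case 2: backup taken from a master; pg_control's start and end
    # locations are both disregarded.
    return label.start_checkpoint_lsn, "backup end WAL record"


if __name__ == "__main__":
    # pg_control copied late, after a second checkpoint at position 200,
    # while the backup itself started at checkpoint 100.
    late_control = ControlFile(checkpoint_lsn=200, min_recovery_point=250)
    label = BackupLabel(start_checkpoint_lsn=100)
    print(recovery_bounds(late_control, label))
    print(recovery_bounds(late_control, label, backup_from_standby=True))
    print(recovery_bounds(late_control, None))
```

In this model, the later checkpoint recorded in a late-copied pg_control
never becomes the redo start position as long as a backup label is
present, which is the worry raised earlier in this message.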

Did I get that right? If so, would it be worth adding some words
to that paragraph in "WAL Internals", to clarify that the pg_control
checkpoint position is not relied on when starting recovery with
a backup label present, and therefore it isn't scary to copy pg_control
late in the backup?

It all seems to make sense ultimately, but it took a lot of reading
and head scratching to get there.

-Chap
