Re: Inadequate thought about buffer locking during hot standby replay

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Inadequate thought about buffer locking during hot standby replay
Date: 2012-11-10 18:16:46
Message-ID: 25470.1352571406@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I'm inclined to think that we need to fix this by getting rid of
> RestoreBkpBlocks per se, and instead having the per-WAL-record restore
> routines dictate when each full-page image is restored (and whether or
> not to release the buffer lock immediately). That's not going to be a
> small change unfortunately :-(

Here's a WIP patch that attacks it in this way. I've only gone as far as
fixing btree_xlog_split() for the moment, since the main point is to see
whether the coding change seems reasonable. At least in this function,
it seems that locking considerations are handled correctly as long as no
full-page image is used, so it's pretty straightforward to tweak it to
handle the case with full-page image(s) as well. I've not looked
through any other replay functions yet --- some of them may need more
invasive hacking. But so far this looks pretty good.

Note that this patch isn't correct yet even for btree_xlog_split(),
because I've not removed the RestoreBkpBlocks() call in btree_redo().
All the btree replay routines will have to get fixed before it's
testable at all.

One thing that seems a bit annoying is the use of zero-based backup
block indexes in RestoreBackupBlock, when most (not all) of the callers
are referencing macros with one-based names (XLR_BKP_BLOCK_1 etc).
That's a bug waiting to happen. We could address it by changing
RestoreBackupBlock to accept a one-based argument, but what I'm thinking
of doing instead is getting rid of the one-based convention entirely;
that is, remove XLR_BKP_BLOCK_1 through XLR_BKP_BLOCK_4 and instead have
all code use the XLR_SET_BKP_BLOCK() macro, which is zero-based. One
advantage of doing that is that it'll force inspection of all code
related to this.

Comments, opinions?

regards, tom lane

Attachment Content-Type Size
restore-backup-blocks-individually-0.patch text/x-patch 10.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2012-11-10 20:59:13 Re: Inadequate thought about buffer locking during hot standby replay
Previous Message Bruce Momjian 2012-11-10 17:41:44 Re: Further pg_upgrade analysis for many tables