From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Inadequate thought about buffer locking during hot standby replay |
Date: | 2012-11-10 20:59:13 |
Message-ID: | CA+U5nMLAuMzfOd4eFmU-MBGWFey8frJkYGXMiqy-OMwTFthfMA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 10 November 2012 18:16, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> I'm inclined to think that we need to fix this by getting rid of
>> RestoreBkpBlocks per se, and instead having the per-WAL-record restore
>> routines dictate when each full-page image is restored (and whether or
>> not to release the buffer lock immediately). That's not going to be a
>> small change unfortunately :-(
>
> Here's a WIP patch that attacks it in this way. I've only gone as far as
> fixing btree_xlog_split() for the moment, since the main point is to see
> whether the coding change seems reasonable. At least in this function,
> it seems that locking considerations are handled correctly as long as no
> full-page image is used, so it's pretty straightforward to tweak it to
> handle the case with full-page image(s) as well. I've not looked
> through any other replay functions yet --- some of them may need more
> invasive hacking. But so far this looks pretty good.
Looks fine, but we need a top-level comment in btree code explaining
why/how it follows the very well explained rules in the README.
> Note that this patch isn't correct yet even for btree_xlog_split(),
> because I've not removed the RestoreBkpBlocks() call in btree_redo().
> All the btree replay routines will have to get fixed before it's
> testable at all.
I was just about to write you back you missed that...
> One thing that seems a bit annoying is the use of zero-based backup
> block indexes in RestoreBackupBlock, when most (not all) of the callers
> are referencing macros with one-based names (XLR_BKP_BLOCK_1 etc).
> That's a bug waiting to happen. We could address it by changing
> RestoreBackupBlock to accept a one-based argument, but what I'm thinking
> of doing instead is getting rid of the one-based convention entirely;
> that is, remove XLR_BKP_BLOCK_1 through XLR_BKP_BLOCK_4 and instead have
> all code use the XLR_SET_BKP_BLOCK() macro, which is zero-based. One
> advantage of doing that is that it'll force inspection of all code
> related to this.
I wouldn't do that in a back branch, but I can see why its a good idea.
Thanks for fixing.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2012-11-10 21:24:22 | Re: Inadequate thought about buffer locking during hot standby replay |
Previous Message | Tom Lane | 2012-11-10 18:16:46 | Re: Inadequate thought about buffer locking during hot standby replay |