| From: | Jeff Davis <pgsql(at)j-davis(dot)com> | 
|---|---|
| To: | Robert Haas <robertmhaas(at)gmail(dot)com> | 
| Cc: | pgsql-hackers(at)postgresql(dot)org | 
| Subject: | Re: page corruption on 8.3+ that makes it to standby | 
| Date: | 2010-07-28 04:23:54 | 
| Message-ID: | 1280291034.9421.6.camel@jdavis | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Tue, 2010-07-27 at 15:23 -0700, Jeff Davis wrote:
> On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote:
> > > My first concern with that idea was that it may create an inconsistency
> > > between the primary and the standby. The primary could have a bunch of
> > > zero pages that never make it to the standby.
> > 
> > Maybe I'm slow on the uptake here, but don't the pages start out
> > all-zeroes on the standby just as they do on the primary? The only way
> > it seems like this would be a problem is if a page that previously
> > contained data on the primary was subsequently zeroed without writing
> > a WAL record - or am I confused?
> 
> The case I was concerned about is when you have a table on the primary
> with a bunch of zero pages at the end. Then you SET TABLESPACE, and none
> of the copied pages (or even the fact that they exist) would be sent to
> the standby, but they would exist on the primary. And later pages may
> have data, so the standby may see page N but not N-1.
> 
> Generally, most of the code is not expecting to read or write past the
> end of the file, unless it's doing an extension.
> 
> However, I think everything is fine during recovery, because it looks
> like it's designed to create zero pages as needed. So your idea seems
> safe to me, although I do still have some doubts because of my lack of
> knowledge in this area; particularly hot standby conflict
> detection/resolution.
> 
> My idea was different: still log the zero page, just don't set LSN or
> TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't
> as clean as your idea, but I'm a little more confident that it is
> correct.
> 
Both potential fixes attached and both appear to work.
fix1 -- Only call PageSetLSN/TLI inside log_newpage() and
heap_xlog_newpage() if the page is not zeroed.
fix2 -- Don't call log_newpage() at all if the page is not zeroed.
Please review. I don't have a strong opinion about which one should be
applied.
Regards,
	Jeff Davis
| Attachment | Content-Type | Size | 
|---|---|---|
| fix1.diff | text/x-patch | 1.4 KB | 
| fix2.diff | text/x-patch | 632 bytes | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrew Dunstan | 2010-07-28 04:41:20 | Re: PostGIS vs. PGXS in 9.0beta3 | 
| Previous Message | Robert Haas | 2010-07-28 03:43:37 | Re: Performance Enhancement/Fix for Array Utility Functions |