Re: page corruption on 8.3+ that makes it to standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: page corruption on 8.3+ that makes it to standby
Date: 2010-07-28 17:18:54
Message-ID: AANLkTi=x+9dHgA0Yy9xPV4bTSP18_wxb=qo75PhEVGXG@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 28, 2010 at 12:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> However, when Simon said "We definitely shouldn't do anything that
>> leaves standby different to primary." you said "obviously". Fix2 can
>> leave a difference between the two, because zeroed pages at the end of
>> the heap file on the primary will not be sent to the standby (the
>> standby will only create the zeroed pages if a higher block number is
>> sent; which won't be the case if the zeroed pages are at the end).
>
>> As we discussed before, that looks inconsequential, but I just want to
>> make sure that it's understood.
>
> I understand it, and I don't like it one bit.  I haven't caught up on
> this thread yet, but I think the only acceptable solution is one that
> leaves the slave in the *same* state as the master.  Not a state that
> we hope will behave equivalently.  I can think of too many corner cases
> where it might not.  (In fact, having a zeroed page in a relation is
> already a corner case in itself, so the amount of testing you'd get for
> such behaviors is epsilon squared.  You don't want to take that bet.)

I might be missing something here, but I don't see how you're going to
manage that. In Jeff's original example, he crashes the database
after extending the relation but before initializing and writing the
new page. I believe that at that point no XLOG has been written yet,
so the relation has been extended but there is no WAL to be sent to
the standby. So now you have the exact situation you're concerned
about - the relation has been extended on the master but not on the
standby. As far as I can see, this is an unavoidable consequence of
the fact that we don't XLOG the act of extending the relation.
Worrying about it only in the specific context of ALTER TABLE .. SET
TABLESPACE seems backwards; if a mismatch like that can cause bugs,
we're already in for it.
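
To make the sequence concrete, here is a toy model of it in Python. This is
only a sketch under stated assumptions, not PostgreSQL code: the names Node,
extend_relation, init_and_log_page and replay are hypothetical, a relation is
just a list of 8192-byte pages, and the "WAL" is a list of (block number,
page image) pairs. The one behavior it encodes is the one described above:
extending the file emits no WAL, and the standby zero-fills only when a
higher block number shows up in WAL.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 8192              # assumed page size for this toy model
ZERO_PAGE = bytes(PAGE_SIZE)  # an all-zeroes page


@dataclass
class Node:
    """Hypothetical stand-in for one server's copy of a single relation."""
    pages: list = field(default_factory=list)

    def nblocks(self):
        return len(self.pages)


def extend_relation(primary):
    """Append a zeroed page on the primary. No WAL record is produced."""
    primary.pages.append(ZERO_PAGE)
    return primary.nblocks() - 1


def init_and_log_page(primary, wal, blkno, payload):
    """Initialize a page and emit a WAL record that names its block number."""
    page = payload.ljust(PAGE_SIZE, b"\0")
    primary.pages[blkno] = page
    wal.append((blkno, page))


def replay(standby, wal):
    """Apply WAL on the standby, zero-filling any gap below a logged block."""
    for blkno, page in wal:
        while standby.nblocks() <= blkno:
            standby.pages.append(ZERO_PAGE)
        standby.pages[blkno] = page


def crash_after_extend():
    """Extend, then 'crash' before the new page is initialized and logged."""
    primary, standby, wal = Node(), Node(), []
    blk = extend_relation(primary)
    init_and_log_page(primary, wal, blk, b"tuple data")
    extend_relation(primary)  # crash here: nothing was logged for block 1
    replay(standby, wal)
    return primary, standby


def later_block_logged():
    """Same zeroed page, but a higher block is logged afterwards."""
    primary, standby, wal = Node(), Node(), []
    extend_relation(primary)        # block 0, left zeroed, never logged
    blk = extend_relation(primary)  # block 1
    init_and_log_page(primary, wal, blk, b"tuple data")
    replay(standby, wal)
    return primary, standby


if __name__ == "__main__":
    p, s = crash_after_extend()
    print("trailing zeroed page:", p.nblocks(), "vs", s.nblocks())
    p, s = later_block_logged()
    print("zeroed page below a logged block:", p.nblocks(), "vs", s.nblocks())
```

In the first scenario the master ends up one block longer than the standby;
in the second the standby recreates the zeroed page itself, so the two match.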

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
