Re: WAL dirty-buffer management bug

From: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL dirty-buffer management bug
Date: 2006-03-31 08:58:52
Message-ID: e0ir6p$1t0i$1@news.hub.org
Lists: pgsql-hackers


"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote
>
> This is pretty much what heapam and btree currently do, but on looking
> at it I think it's got a problem: we really ought to mark the buffer
> dirty before releasing the critical section. Otherwise, if there's an
> elog(ERROR) before the WriteBuffer call is reached, the backend would go
> on about its business, and we'd have changes in a disk buffer that isn't
> marked dirty. The changes would be uncommitted, presumably, because of
> the error --- but nonetheless this could result in inconsistency down
> the road. One example scenario is:
> 1. We insert a tuple at, say, index 3 on a page.
> 2. elog after making the XLOG entry, but before WriteBuffer.
> 3. page is later discarded from shared buffers; since it's not
> marked dirty, it'll just be dropped without writing it.
> 4. Later we need to insert another tuple in same table, and
> we again choose index 3 on this page as the place to put it.
> 5. system crash leads to replay from WAL.
> Now we'll have two different WAL records trying to insert tuple 3.
> Not good.
>

It may not be good, but it is not harmful either. At step 2, the transaction
will abort and leave a page that has been changed but not marked dirty. Two
situations could happen after that. One is step 3; the other is that the
page is still in the buffer pool and another transaction writes to it
(no problem, since the tuple slot is already marked used). For step 3, yes, we
will see two WAL records trying to insert into the same tuple slot, but the
2nd one will overwrite the 1st one, so no problem. If the 2nd one does not
overwrite the 1st one (say that WAL record is broken), that is also no problem,
since the tuple header will guarantee that the tuple is invisible. Can you give
an example where this leads to data corruption?
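
To make the step 3 case concrete, here is a toy model of the scenario in
Python. It is only a sketch: Page, replay and run_scenario are made-up names,
not PostgreSQL code, and it assumes that redo simply overwrites whatever is
in the slot. Whether real redo behaves that way is the very point in question.

```python
# Toy model only: none of these names are PostgreSQL APIs.

class Page:
    def __init__(self, nslots=8):
        self.slots = [None] * nslots   # None means the slot is free


def replay(wal, disk_page):
    """Redo every logged insert in order (assumed: blind overwrite)."""
    for slot, tuple_data in wal:
        disk_page.slots[slot] = tuple_data
    return disk_page


def run_scenario():
    disk = Page()   # on-disk copy of the page
    wal = []        # list of (slot, tuple) insert records

    # Steps 1-2: insert at slot 3 in the buffer, log it, then hit an error
    # before the buffer is marked dirty.
    buffer = Page()
    buffer.slots[3] = "tuple A (aborted xact)"
    wal.append((3, "tuple A (aborted xact)"))
    dirty = False   # WriteBuffer was never reached

    # Step 3: eviction; a buffer that is not dirty is dropped unwritten.
    if dirty:
        disk.slots = list(buffer.slots)
    buffer = None

    # Step 4: the page is read back from disk; slot 3 looks free again.
    buffer = Page()
    buffer.slots = list(disk.slots)
    assert buffer.slots[3] is None
    buffer.slots[3] = "tuple B"
    wal.append((3, "tuple B"))

    # Step 5: crash, then replay from WAL onto the on-disk page.
    return replay(wal, disk), wal
```

Under that overwrite assumption, run_scenario() ends with the second tuple in
slot 3 and both records still in the log, which is the "2nd one covers the
1st one" outcome described above.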

Regards,
Qingqing
