Bug in visibility hint bit

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Bug in visibility hint bit
Date: 2009-08-25 01:23:08
Message-ID: f67928030908241823q417ea005sd03b2d888ea84ad@mail.gmail.com
Lists: pgsql-hackers

There seems to be a bug in the visibility map in 8.4.0, introduced into
CVS on 2008-12-03. It results in tuples being treated as visible when
they shouldn't be.

In the heap_update function in heapam.c:

    /*
     * Note: we mustn't clear PD_ALL_VISIBLE flags before writing the WAL
     * record, because log_heap_update looks at those flags to set the
     * corresponding flags in the WAL record.
     */

So the full-page image of the block written to WAL still has
PD_ALL_VISIBLE set, which is wrong. The flag needs to be cleared during
WAL replay after a crash. But it is not.

In heap_xlog_update:

    if (record->xl_info & XLR_BKP_BLOCK_1)
    {
        if (samepage)
            return;     /* backup block covered both changes */
        goto newt;
    }

The goto newt causes it to skip the code that would have called
PageClearAllVisible.

I don't feel particularly competent to propose a patch for this. It
seems to me that log_heap_update should be passed the correct block in
the first place, and that some other method should be used to
communicate between heap_update and log_heap_update, if communication
is necessary at all. But really, I don't think such communication
should be necessary, and the xlrec.all_visible_cleared and
xlrec.new_all_visible_cleared fields are unneeded. Just assume they are
true. It seems like the worst thing that can happen is that we call
PageClearAllVisible when the flag is already cleared, which is hardly
harmful (the blocks that have redo applied to them are already dirty,
so a spurious clear doesn't cause unneeded I/O).

Jeff
