Re: [GENERAL] PANIC: heap_update_redo: no block

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: "Alex bahdushka" <bahdushka(at)gmail(dot)com>, "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GENERAL] PANIC: heap_update_redo: no block
Date: 2006-03-28 03:03:59
Message-ID: 26340.1143515039@sss.pgh.pa.us

Greg Stark <gsstark(at)mit(dot)edu> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> I think what's happened here is that VACUUM FULL moved the only tuple
>> off page 1 of the relation, then truncated off page 1, and now
>> heap_update_redo is panicking because it can't find page 1 to replay the
>> move. Curious that we've not seen a case like this before, because it
>> seems like a generic hazard for WAL replay.

> This sounds familiar
> http://archives.postgresql.org/pgsql-hackers/2005-05/msg01369.php

After further review I've concluded that there is not a systemic bug
here, but there are several nearby local bugs. The reason it's not
a systemic bug is that this scenario is supposed to be handled by the
same mechanism that prevents torn-page writes: the first XLOG record
that touches a given page after a checkpoint is supposed to rewrite
the entire page, rather than update it incrementally. Since XLOG replay
always begins at a checkpoint, this means we should always be able to
write a fresh copy of the page, even after relation deletion or
truncation. Furthermore, during XLOG replay we are willing to create
a table (or even a whole tablespace or database directory) if it's not
there when touched. The subsequent replay of the deletion or truncation
will get rid of any unwanted data again.
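
To make the argument concrete, here is a minimal toy model of that replay rule. It is a sketch under stated assumptions, not PostgreSQL code: ToyRel, ToyRecord, toy_extend, toy_redo and toy_replay_demo are hypothetical names invented for illustration, and the 16-byte page is arbitrary. The on-disk relation has already lost block 1 to a truncation, and replay then meets a record that touches block 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 16            /* toy value, unrelated to BLCKSZ */

typedef struct { char (*pages)[TOY_PAGE_SIZE]; unsigned npages; } ToyRel;

typedef enum { REC_FULL_PAGE, REC_INCREMENTAL, REC_TRUNCATE } ToyRecType;

typedef struct {
    ToyRecType type;
    unsigned   blkno;               /* target block; new length for REC_TRUNCATE */
    unsigned   offset;              /* byte to change, REC_INCREMENTAL only */
    char       byte;                /* new value, REC_INCREMENTAL only */
    char       image[TOY_PAGE_SIZE];/* whole page, REC_FULL_PAGE only */
} ToyRecord;

/* Grow the relation with zeroed pages until block blkno exists. */
static bool toy_extend(ToyRel *rel, unsigned blkno)
{
    if (blkno < rel->npages)
        return true;
    void *p = realloc(rel->pages, (size_t) (blkno + 1) * TOY_PAGE_SIZE);
    if (p == NULL)
        return false;
    rel->pages = p;
    memset(rel->pages[rel->npages], 0,
           (size_t) (blkno + 1 - rel->npages) * TOY_PAGE_SIZE);
    rel->npages = blkno + 1;
    return true;
}

/* Replay one record; returns false where real replay would PANIC. */
static bool toy_redo(ToyRel *rel, const ToyRecord *rec)
{
    switch (rec->type)
    {
        case REC_FULL_PAGE:         /* a missing page is simply re-created */
            if (!toy_extend(rel, rec->blkno))
                return false;
            memcpy(rel->pages[rec->blkno], rec->image, TOY_PAGE_SIZE);
            return true;
        case REC_INCREMENTAL:       /* needs the old page contents */
            assert(rec->offset < TOY_PAGE_SIZE);
            if (rec->blkno >= rel->npages)
                return false;       /* the "no block" case */
            rel->pages[rec->blkno][rec->offset] = rec->byte;
            return true;
        case REC_TRUNCATE:          /* drops any page re-created earlier */
            if (rec->blkno < rel->npages)
                rel->npages = rec->blkno;
            return true;
    }
    return false;
}

/* Returns the final relation length in blocks, or -1 if replay failed. */
static int toy_replay_demo(bool first_touch_is_full_page)
{
    ToyRel    rel = { NULL, 0 };
    ToyRecord recs[2];
    bool      ok = toy_extend(&rel, 0);   /* only block 0 survives on disk */

    memset(recs, 0, sizeof recs);
    recs[0].type = first_touch_is_full_page ? REC_FULL_PAGE : REC_INCREMENTAL;
    recs[0].blkno = 1;
    recs[0].offset = 3;
    recs[0].byte = 'x';
    recs[1].type = REC_TRUNCATE;
    recs[1].blkno = 1;                    /* truncate back to one block */

    for (int i = 0; i < 2 && ok; i++)
        ok = toy_redo(&rel, &recs[i]);

    int result = ok ? (int) rel.npages : -1;
    free(rel.pages);
    return result;
}
```

In this model, toy_replay_demo(true) completes and leaves a one-block relation, because the full-page record re-creates block 1 and the later truncation removes it again. toy_replay_demo(false) fails at the first record, which is the situation the PANIC in this thread corresponds to.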

Therefore, there is no systemic bug --- unless you are running with
full_page_writes=off. I assert that that GUC variable is broken and
must be removed.

There are, however, a bunch of local bugs, including these:

* On a symlink-less platform (i.e., Windows), TablespaceCreateDbspace is
#ifdef'd to be a no-op. This is wrong because it performs the essential
function of re-creating a tablespace or database directory if needed
during replay. AFAICS the #if can just be removed, leaving the same
code in use with or without symlinks.

* log_heap_update decides that it can set XLOG_HEAP_INIT_PAGE instead
of storing the full destination page, if the destination contains only
the single tuple being moved. This is fine, except it also resets the
buffer indicator for the *source* page, which is wrong --- that page
may still need to be re-generated from the xlog record. This is the
proximate cause of the bug report that started this thread.

* btree_xlog_split passes extend=false to XLogReadBuffer for the left
sibling, which is silly because it is going to rewrite that whole page
from the xlog record anyway. It should pass true so that there's no
complaint if the left sib page was later truncated away. This accounts
for one of the bug reports mentioned in the message cited above.

* btree_xlog_delete_page passes extend=false for the target page,
which is likewise silly because it's going to init the page (not that
there was any useful data on it anyway). This accounts for the other
bug report mentioned in the message cited above.
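
The log_heap_update item above can be sketched as a small decision function. This is a hypothetical model, not the actual PostgreSQL source: ToyUpdatePlan and toy_plan_update are invented names, and the two "may need image" flags stand in loosely for the per-page buffer indicators described in that item.

```c
#include <stdbool.h>

typedef struct {
    bool init_dest_page;          /* stands in for XLOG_HEAP_INIT_PAGE */
    bool source_may_need_image;   /* source page still eligible for a full image */
    bool dest_may_need_image;     /* destination page still eligible for one */
} ToyUpdatePlan;

/*
 * Decide what a toy update record carries.  reset_source_too = true models
 * the behavior the post describes as wrong; false models the intended one.
 */
static ToyUpdatePlan toy_plan_update(bool dest_holds_only_new_tuple,
                                     bool reset_source_too)
{
    ToyUpdatePlan plan = { false, true, true };

    if (dest_holds_only_new_tuple)
    {
        /* The destination can be rebuilt from scratch at replay time. */
        plan.init_dest_page = true;
        plan.dest_may_need_image = false;

        /* Wrong: the source page may still have to be regenerated. */
        if (reset_source_too)
            plan.source_may_need_image = false;
    }
    return plan;
}
```

With reset_source_too set to false, the plan still initializes the destination page but leaves the source page eligible for a full image, so replay can regenerate the source page even if it was later truncated away.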

Clearly, we need to go through the xlog code with a fine-tooth comb
and convince ourselves that all pages touched by any xlog record will
be properly reconstituted if they've later been truncated off. I have
not yet examined any of the code except the above.

Notice that these are each, individually, pretty low-probability
scenarios, which is why we've not seen many bug reports. If we had had
a systemic bug I'm sure we'd be seeing far more.

regards, tom lane
