Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Sergey Konoplev <gray(dot)ru(at)gmail(dot)com>, matioli(dot)matheus(at)gmail(dot)com, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>, Максим Панченко <Panchenko(at)gw(dot)tander(dot)ru>, Сизов Сергей Павлович <sizov_sp(at)gw(dot)tander(dot)ru>
Subject: Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages
Date: 2014-01-14 01:59:48
Message-ID: 26165.1389664788@sss.pgh.pa.us
Lists: pgsql-bugs pgsql-hackers

BTW, while I'm looking at this ... the writing side of the code seems a
few bricks shy of a load too:

    /*
     * InHotStandby we need to scan right up to the end of the index for
     * correct locking, so we may need to write a WAL record for the final
     * block in the index if it was not vacuumed. It's possible that VACUUMing
     * has actually removed zeroed pages at the end of the index so we need to
     * take care to issue the record for last actual block and not for the
     * last block that was scanned. Ignore empty indexes.
     */
    if (XLogStandbyInfoActive() &&
        num_pages > 1 && vstate.lastBlockVacuumed < (num_pages - 1))
    {
        Buffer      buf;

        /*
         * We can't use _bt_getbuf() here because it always applies
         * _bt_checkpage(), which will barf on an all-zero page. We want to
         * recycle all-zero pages, not fail. Also, we want to use a
         * nondefault buffer access strategy.
         */
        buf = ReadBufferExtended(rel, MAIN_FORKNUM, num_pages - 1, RBM_NORMAL,
                                 info->strategy);
        LockBufferForCleanup(buf);
        _bt_delitems_vacuum(rel, buf, NULL, 0, vstate.lastBlockVacuumed);
        _bt_relbuf(rel, buf);
    }

If the last physical page of the index is all-zero, the preceding loop
won't have any problem with that (nor should it); but this code sure will
have a problem, because _bt_delitems_vacuum isn't prepared to cope with an
all-zero page AFAICS.

Nor am I following the logic of the initial comment. If we know that
pages above N are all-zero, why are we worried about taking locks on them?
There shouldn't be any scans reaching such pages, and any that did would
error out anyhow in _bt_getbuf().

ISTM the right thing here is for the btvacuumpage loop to remember the
last ordinary, valid index page it saw, and point to that one in this
added WAL entry.
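
To make the idea concrete, here is a rough standalone sketch of "remember the last ordinary, valid page". It is not PostgreSQL code and not a patch: every name prefixed with sketch_ is hypothetical, the index is modeled as a flat byte array, and "valid" is simplified to "not all-zero" (the real loop would also have to decide how to treat deleted or half-dead pages).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
#define SKETCH_PAGE_SIZE     8192
#define SKETCH_INVALID_BLOCK UINT32_MAX

typedef uint32_t sketch_block_t;

/* Simplified validity test: a page is "valid" if any byte is nonzero. */
static bool
sketch_page_is_all_zero(const unsigned char *page)
{
    for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
    {
        if (page[i] != 0)
            return false;
    }
    return true;
}

/*
 * Walk the index the way the vacuum loop does (block 0 is assumed to be
 * the metapage and is skipped) and remember the highest block that was
 * not all-zero. Returns SKETCH_INVALID_BLOCK if there is no such block.
 */
static sketch_block_t
sketch_last_valid_block(const unsigned char *index, sketch_block_t num_pages)
{
    sketch_block_t last_valid = SKETCH_INVALID_BLOCK;

    assert(index != NULL || num_pages == 0);

    for (sketch_block_t blk = 1; blk < num_pages; blk++)
    {
        if (!sketch_page_is_all_zero(index + (size_t) blk * SKETCH_PAGE_SIZE))
            last_valid = blk;
    }
    return last_valid;
}

/*
 * The extra WAL record would then target last_valid instead of
 * num_pages - 1, and only when that page lies beyond the last block
 * already covered by a vacuum record.
 */
static bool
sketch_needs_final_record(sketch_block_t last_valid,
                          sketch_block_t last_block_vacuumed)
{
    return last_valid != SKETCH_INVALID_BLOCK &&
           last_block_vacuumed < last_valid;
}
```

With a five-page index whose last two pages are all-zero, sketch_last_valid_block() reports block 2, so the trailing record would never be aimed at a zeroed page.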

regards, tom lane
