Re: PageIsAllVisible()'s trustworthiness in Hot Standby

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PageIsAllVisible()'s trustworthiness in Hot Standby
Date: 2012-12-11 20:05:06
Message-ID: CA+TgmobdH9sOgpdJ30w8K4Di-TNizRnWXhP5+9aL70H=eKJ-jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 4, 2012 at 12:10 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:
> Hmm. Yeah, I do not have guts to prove that either. I'll probably write up a
> comment for your consideration to explain why we don't trust PD_ALL_VISIBLE
> in Hot standby for seq scans, but still trust VM for index-only scans.

Sure.

> Another piece of code that caught my attention is this (sorry for digging up
> topics that probably have been discussed to death before and I might not
> have paid attention to then):
>
> Whenever we clear PD_ALL_VISIBLE flag, we do that while holding the heap
> page lock and also WAL log that action (in fact, piggy-back with
> insert/update/delete). The only exception that I saw is in lazy_scan_heap()
>
> 916 else if (PageIsAllVisible(page) && has_dead_tuples)
> 917 {
> 918 elog(WARNING, "page containing dead tuples is marked as
> all-visible in relation \"%s\" page %u",
> 919 relname, blkno);
> 920 PageClearAllVisible(page);
> 921 MarkBufferDirty(buf);
> 922 visibilitymap_clear(onerel, blkno, vmbuffer);
> 923 }
>
> Obviously, this is not a case that we expect to occur. But if it does occur
> for whatever reason, then even though we will correct the page flag on
> master, this would never be populated to the standby. So the standby may
> continue to have that tiny bit set on the page, at least until another
> insert/update/delete happens on the page. Not sure if there is anything to
> worry about, but looks a bit strange. I looked at the archive but can't see
> any report of the warning, so may be we are safe after all.

The text of that warning message was different in previous releases,
and I have seen the older text come up. I don't really want to invent
a new WAL record type just to cover that case, because that seems like
it's more likely to cause problems than to fix them. We could
consider emitting an XLOG_HEAP_NEWPAGE record, perhaps. I'm not sure
whether it's worth it, though. This is only going to arise
(hopefully) if your master is corrupted - that little hunk of code is
really a $0.02 solution to a $1.00 problem. I don't know that we want
to pretend that it's any more than a band-aid.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-12-11 20:07:29 Re: Multiple --table options for other commands
Previous Message Pavel Stehule 2012-12-11 19:34:32 Re: skipping context for RAISE statements - maybe obsolete?