Re: crash-safe visibility map, take three

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: crash-safe visibility map, take three
Date: 2010-12-01 17:31:36
Message-ID: AANLkTinjMjut9Xv2RV8W=8tUAsk+hCEj0QXNA1s=EdmE@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 1, 2010 at 12:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I think we can improve this a bit further by also introducing a
>> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
>> FrozenXID.  This allows us to freeze tuples aggressively - if we want
>> - without losing any forensic information.
>
> So far so good ...
>
>> We can then modify the
>> above algorithm slightly, so that when we observe that a page is all
>> visible, we not only set PD_ALL_VISIBLE on the page but also
>> HEAP_XMIN_FROZEN on each tuple.  The WAL record marking the page as
>> all-visible then doubles as a WAL record marking it frozen,
>> eliminating the need to dirty the page yet again at anti-wraparound
>> vacuum time.
>
> but this seems a lot more dubious/fragile.  The basic problem is that
> it's not clear whether HEAP_XMIN_FROZEN is a hint bit or essential
> data.  If you want to set it without the overhead of an LSN bump or a
> possible FPI in WAL, then it's a hint bit.  But if you're using it to
> protect clog truncation then it's essential data.  Perhaps you can make
> this work but there are some nonobvious requirements:
>
> 1. Seeing PD_ALL_VISIBLE set does not excuse vacuum from having to
> iterate through all the tuples on the page checking for
> HEAP_XMIN_FROZEN.  This is because the non-logged update of the page
> might have been torn on the way to disk, such that PD_ALL_VISIBLE got
> set but not all of the FROZEN bits did.

Good point. If we see the bit set in the visibility map set, it
should be safe to infer that the PD_ALL_VISIBLE bit and all
HEAP_XMIN_FROZEN bits are set. But if the visibility map bit is NOT
set, we must check PD_ALL_VISIBLE and, whether it's set or not, each
individual HEAP_XMIN_FROZEN bit.

> 2. During an anti-wraparound vacuum, you *need to* emit a WAL record
> when setting HEAP_XMIN_FROZEN.  It's not a hint, any more than writing
> FrozenXID is now.
>
> Actually, #2 isn't even good enough.  What if vacuum passes over a page
> and finds all the FROZEN bits set, but the reason they're set is that
> somebody else updated them in hint fashion microseconds before?  It
> seems possible that those bits might not make it to disk before a
> subsequent crash.  The only way to be really sure those bits are set is
> to emit a WAL record that says to set them, whether or not they seem to
> be set already.  While the WAL record could be small, you'd need one for
> every page, making the argument that this saves I/O somewhat dubious.

I think that we would only ever allow HEAP_XMIN_FROZEN to be set as
part of a WAL-logged operation. Either we are marking the page
all-visible - in which case we're emitting the new WAL record type
XLOG_HEAP_ALLVISIBLE - or we're freezing individual tuples on a page
where very old and very new tuples are intermixed - in which case we
emit the existing XLOG_HEAP2_FREEZE.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2010-12-01 17:44:03 Re: improving foreign key locks
Previous Message Robert Haas 2010-12-01 17:23:50 Re: crash-safe visibility map, take three