Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I think we can improve this a bit further by also introducing a
> HEAP_XMIN_FROZEN bit that we set in lieu of overwriting XMIN with
> FrozenXID. This allows us to freeze tuples aggressively - if we want
> - without losing any forensic information.
So far so good ...
> We can then modify the
> above algorithm slightly, so that when we observe that a page is all
> visible, we not only set PD_ALL_VISIBLE on the page but also
> HEAP_XMIN_FROZEN on each tuple. The WAL record marking the page as
> all-visible then doubles as a WAL record marking it frozen,
> eliminating the need to dirty the page yet again at anti-wraparound
> vacuum time.
but this seems a lot more dubious/fragile. The basic problem is that
it's not clear whether HEAP_XMIN_FROZEN is a hint bit or essential
data. If you want to set it without the overhead of an LSN bump or a
possible FPI in WAL, then it's a hint bit. But if you're using it to
protect clog truncation then it's essential data. Perhaps you can make
this work but there are some nonobvious requirements:
1. Seeing PD_ALL_VISIBLE set does not excuse vacuum from having to
iterate through all the tuples on the page checking for
HEAP_XMIN_FROZEN. This is because the non-logged update of the page
might have been torn on the way to disk, such that PD_ALL_VISIBLE got
set but not all of the FROZEN bits did.
2. During an anti-wraparound vacuum, you *need to* emit a WAL record
when setting HEAP_XMIN_FROZEN. It's not a hint, any more than writing
FrozenXID is now.
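To make requirement 1 concrete, here is a minimal sketch of the per-tuple check, not actual PostgreSQL code. The `Sketch*` types, the flag values, and `sketch_page_all_frozen` are all hypothetical names invented for illustration. The only point is that the page-level flag is never consulted when deciding whether every tuple is frozen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch; not taken from the real headers. */
#define SKETCH_PD_ALL_VISIBLE    0x0004
#define SKETCH_HEAP_XMIN_FROZEN  0x0200

#define SKETCH_MAX_TUPLES 8

/* Hypothetical, heavily simplified tuple header. */
typedef struct SketchTuple {
    uint16_t infomask;
} SketchTuple;

/* Hypothetical, heavily simplified heap page. */
typedef struct SketchPage {
    uint16_t    pd_flags;
    size_t      ntuples;
    SketchTuple tuples[SKETCH_MAX_TUPLES];
} SketchPage;

/*
 * Returns true only if every tuple on the page carries the frozen bit.
 * pd_flags is deliberately ignored: after a torn write the page-level
 * flag may be set while some per-tuple bits never reached disk.
 */
static bool
sketch_page_all_frozen(const SketchPage *page)
{
    assert(page != NULL);
    assert(page->ntuples <= SKETCH_MAX_TUPLES);

    for (size_t i = 0; i < page->ntuples; i++)
    {
        if ((page->tuples[i].infomask & SKETCH_HEAP_XMIN_FROZEN) == 0)
            return false;
    }
    return true;
}
```

A page whose `pd_flags` includes `SKETCH_PD_ALL_VISIBLE` but whose second tuple lacks `SKETCH_HEAP_XMIN_FROZEN` makes this function return false, which is the torn-page case described in point 1.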
Actually, #2 isn't even good enough. What if vacuum passes over a page
and finds all the FROZEN bits set, but the reason they're set is that
somebody else updated them in hint fashion microseconds before? It
seems possible that those bits might not make it to disk before a
subsequent crash. The only way to be really sure those bits are set is
to emit a WAL record that says to set them, whether or not they seem to
be set already. While the WAL record could be small, you'd need one for
every page, making the argument that this saves I/O somewhat dubious.
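The crash scenario can be illustrated with a toy model, offered as a sketch under loud assumptions rather than as how the server behaves. `SketchStore` and the `sketch_*` functions are hypothetical. The model assumes that no dirty buffer is written out before the crash and that every WAL record emitted has been flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NPAGES 4

/* Hypothetical toy model: one "all tuples frozen" state per page. */
typedef struct SketchStore {
    bool buffer_frozen[SKETCH_NPAGES]; /* in-memory copy, lost on crash   */
    bool disk_frozen[SKETCH_NPAGES];   /* durable on-disk image           */
    bool wal_freeze[SKETCH_NPAGES];    /* flushed WAL: "freeze this page" */
} SketchStore;

/* Someone sets the bits hint-style: memory only, no WAL record. */
static void
sketch_set_hint(SketchStore *s, size_t page)
{
    assert(s != NULL && page < SKETCH_NPAGES);
    s->buffer_frozen[page] = true;
}

/*
 * Vacuum visits a page. With always_log false it trusts bits that already
 * look set and emits nothing; with always_log true it emits a WAL record
 * whether or not the bits seem to be set already.
 */
static void
sketch_vacuum_page(SketchStore *s, size_t page, bool always_log)
{
    assert(s != NULL && page < SKETCH_NPAGES);
    if (s->buffer_frozen[page] && !always_log)
        return;                     /* unsafe: nothing durable records this */
    s->wal_freeze[page] = true;
    s->buffer_frozen[page] = true;
}

/* Crash discards memory; recovery replays WAL onto the disk image. */
static void
sketch_crash_and_recover(SketchStore *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < SKETCH_NPAGES; i++)
    {
        if (s->wal_freeze[i])
            s->disk_frozen[i] = true;
        s->buffer_frozen[i] = s->disk_frozen[i];
    }
}
```

In this model, a page whose bits were set by `sketch_set_hint` and then skipped by `sketch_vacuum_page(..., false)` comes back unfrozen after `sketch_crash_and_recover`, while the same sequence with `always_log` set to true comes back frozen. The cost is one WAL record per page visited.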
regards, tom lane