Re: New vacuum option to do only freezing

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New vacuum option to do only freezing
Date: 2019-05-02 16:02:59
Message-ID: CA+TgmobjyLAkSrGS3xM9fX9hDWe886YTxHYB30CO-FqzseMC0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 2, 2019 at 10:28 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> It'd be good if somebody could make a pass over the safety mechanisms in
> heap_prepare_freeze_tuple(). I added them at some point, after a data
> corrupting bug related to freezing, but they really were more intended
> as a secondary layer of defense, rather than the primary one. My
> understanding is still that we assume that we never should reach
> heap_prepare_freeze_tuple() for something that is below the horizon,
> even after this change, right?

I think so. This code is hit if the tuple wasn't dead yet at the time
that heap_page_prune() decided not to prune it, but it is dead by the
time we retest the tuple status. But the same value of OldestXmin was
used for both checks, so the only way that can really happen is if the
XID in the tuple header was running before and is no longer running.
However, if the XID was running at the time that heap_page_prune()
ran, than OldestXmin certainly can't be newer than that XID. And
therefore the value we're intending to set for relfrozenxid has surely
got to be older, so we could hardly prune using OldestXmin as the
threshold and then relfrozenxid to a newer XID.

Actually, I now believe that my original concern here was exactly
backwards. Prior to the logic change in this commit, with index
vacuum disabled, a tuple that becomes dead between the
heap_page_prune() check and the lazy_scan_heap() check would have
followed the tupgone = true path. That would cause it to be treated
as if the second vacuum pass were going to remove it - i.e. not
frozen. But with index cleanup disabled, there will never be a second
vacuum pass. So a tuple that was actually being kept was not sent to
heap_prepare_freeze_tuple() with, basically, no justification.
Imagine for example that the tuple was not old in terms of its XID
age, but its MXID age was somehow ancient. Then we'd fail to freeze
it on the theory that it was going to be removed, but yet not remove
it. Oops. The revised logic - which is as per Tom's suggestion -
does the right thing, which is treat the tuple as one we've chosen to
keep.

While looking at this code, I think I may have spotted another bug, or
at least a near-bug. lazy_record_dead_tuple() thinks it's OK to just
forget about dead tuples if there's not enough memory, which I think
is OK in the normal case where the dead tuple has been truncated to a
dead line pointer. But in the tupgone = true case it's NOT ok,
because in that case we're leaving behind not just a dead line pointer
but an actual tuple which we have declined to freeze on the assumption
that it will be removed later. But if lazy_record_dead_tuple()
forgets about it, then it won't be removed later. That could lead to
tuples remaining that precede the freeze horizons. The reason why
this may be only a near-bug is that we seem to try pretty hard to make
sure that we'll never call that function in the first place without
enough space being available. Still, it seems to me that it would be
safer to change the code to look like:

if (vacrelstats->num_dead_tuples >= vacrelstats->max_dead_tuples)
elog(ERROR, "oh crap");

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-05-02 16:03:35 Re: New vacuum option to do only freezing
Previous Message Tom Lane 2019-05-02 16:02:44 Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6