Re: Improving the "Routine Vacuuming" docs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improving the "Routine Vacuuming" docs
Date: 2022-04-13 20:24:54
Message-ID: CA+TgmoaxUvkR=ACNQvN=7Xkm8t+Q+FmW70GkYtxF_VupEW78=A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 13, 2022 at 12:34 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> What do you think of the idea of relating freezing to removing tuples
> by VACUUM at this point? This would be a basis for explaining how
> freezing and tuple removal are constrained by the same cutoff. A very
> old snapshot can hold up cleanup, but it can also hold up freezing to
> the same degree (it's just not as obvious because we are less eager
> about freezing by default).

I think something like that could be useful, if we can find a way to
word it sufficiently clearly.

> Perhaps we can agree on some (or even all) of the following specific points:
>
> * We shouldn't mention "4 billion XIDs" at all.
>
> * We should say that the issue is one of distances between
> unfrozen XIDs. The maximum distance that can ever be allowed to emerge
> between any two unfrozen XIDs in a cluster is about 2 billion XIDs.
>
> * We don't need to say anything about how XIDs are compared, normal vs
> permanent XIDs, etc.
>
> * The system takes drastic intervention to prevent this implementation
> restriction from becoming a problem, starting with anti-wraparound
> autovacuums. Then there's the failsafe. Finally, there's the
> xidStopLimit mechanism, our last line of defense.

Those all sound pretty reasonable. There's a little bit of doubt in my
mind about the third one; I think it could possibly be useful to
explain that the XID space is circular and 0-2 are special, but maybe
not.
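
To make the "circular, 0-2 are special" point concrete, here is a minimal
sketch of how such a comparison can work. It is modeled on the logic of
TransactionIdPrecedes() in the PostgreSQL source, but it is an illustration
in Python, not the actual implementation; the function names is_normal and
xid_precedes are invented for this example. It also shows why the distance
between two unfrozen XIDs has to stay under about 2 billion (2^31).

```python
# Illustrative sketch only: circular comparison of 32-bit transaction IDs.
# The constant values mirror PostgreSQL's special XIDs; the function names
# below are hypothetical and chosen for this example.

INVALID_XID = 0        # InvalidTransactionId
BOOTSTRAP_XID = 1      # BootstrapTransactionId
FROZEN_XID = 2         # FrozenTransactionId
FIRST_NORMAL_XID = 3   # FirstNormalTransactionId

XID_MODULUS = 2 ** 32  # XIDs are 32-bit unsigned counters
HALF_SPACE = 2 ** 31   # "about 2 billion"


def is_normal(xid: int) -> bool:
    """Normal XIDs are the ones that take part in circular comparison."""
    return xid >= FIRST_NORMAL_XID


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b."""
    if not is_normal(a) or not is_normal(b):
        # Special XIDs are compared as plain unsigned integers, so they
        # sort before every normal XID.
        return a < b
    # Emulate a signed 32-bit subtraction: a "negative" difference means
    # that a is behind b on the circle.
    diff = (a - b) % XID_MODULUS
    return diff >= HALF_SPACE


if __name__ == "__main__":
    # Ordinary case: no wraparound of the counter involved.
    print(xid_precedes(100, 200))
    # The counter has wrapped: a numerically huge XID is still "older".
    print(xid_precedes(4_294_967_000, 100))
    # More than 2^31 apart: the old XID now appears to be in the future.
    print(xid_precedes(3, 3 + HALF_SPACE + 5))
```

The last call is the failure that freezing exists to prevent: once two
unfrozen XIDs drift more than 2^31 apart, the older one stops comparing as
older.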

> > I think it is wrong to conflate wraparound with xidStopLimit.
> > xidStopLimit is the final defense against an actual wraparound, and
> > like I say, an actual wraparound is quite possible if you put the
> > system in single user mode and then do something like this:
>
> I forgot to emphasize one aspect of the problem that seems quite
> important: the document itself seems to conflate the xidStopLimit
> mechanism with true wraparound. At least I thought so. Last year's
> thread on this subject ('What is "wraparound failure", really?') was
> mostly about that confusion. I personally found that very confusing,
> and I doubt that I'm the only one.

OK.

> There is no good reason to use single user mode anymore (a related
> problem with the docs is that we still haven't made that point). And

Agreed.

> the pg_upgrade bug that led to invalid relfrozenxid values was
> flagrantly just a bug (a WARNING for this was added recently, in commit
> e83ebfe6). So while I accept that the distinction you're making here
> is valid, maybe we can fix the single user mode doc bug too, removing
> the need to discuss "true wraparound" as a general phenomenon. You
> shouldn't ever see it in practice anymore. If you do then either
> you've done something that "invalidated the warranty", or you've run
> into a legitimate bug.

I think it is probably important to discuss this, but along the lines
of: it is possible to bypass all of these safeguards and cause a true
wraparound by running in single-user mode. Don't do that. There's no
wraparound situation that can't be addressed just fine in multi-user
mode, and here's how to do that. In previous releases, we used to
sometimes recommend single-user mode, but that's no longer necessary
and not a good idea, so steer clear.

--
Robert Haas
EDB: http://www.enterprisedb.com
