Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Subject: Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound
Date: 2023-05-01 12:33:52
Message-ID: CAFBsxsFQrDecSOVNeuN0Ay3bJCVp7rVFCBVXMNNNj4UwyGTBHA@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 1, 2023 at 2:30 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sat, Apr 29, 2023 at 7:30 PM John Naylor
> <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> > How about
> >
> > -HINT: To avoid a database shutdown, [...]
> > +HINT: To prevent XID exhaustion, [...]
> >
> > ...and "MXID", respectively? We could explain in the docs that vacuum
> > and read-only queries still work "when XIDs have been exhausted", etc.
>
> I think that that particular wording works in this example -- we *are*
> avoiding XID exhaustion. But it still doesn't really address my
> concern -- at least not on its own. I think that we need a term for
> xidStopLimit mode (and perhaps multiStopLimit) itself. This is a
> discrete state/mode that is associated with a specific mechanism.

Well, since you have a placeholder "xidStopLimit mode" in your other patch,
I'll confine my response to there. Inventing "modes" seems like an awkward
thing to backpatch, not to mention moving the goalposts. My goal
here is quite limited: to stop lying to our users about "not accepting
commands", and change tragically awful advice into sensible advice.

Here's my new idea:

-HINT: To avoid a database shutdown, [...]
+HINT: To prevent XID generation failure, [...]

Actually, I like "allocation" better, but the v8 patch now has "generation"
simply because one MXID message already has "generate" and I did it that
way before thinking too hard. I'd be okay with either one as long as it's
consistent.

> > (I should probably also add in the commit message that the "shutdown"
> > in the message was carried over to MXIDs when they arrived also in 2005).

Done

> > > Separately, there is a need to update a couple of other places to use
> > > this new terminology. The documentation for vacuum_failsafe_age and
> > > vacuum_multixact_failsafe_age refer to "system-wide transaction ID
> > > wraparound failure", which seems less than ideal, even without your
> > > patch.
> >
> > Right, I'll have a look.

Looking now, I'm even less inclined to invent new terminology in back
branches.

> As you know, there is a more general problem with the use of the term
> "wraparound" in the docs, and in the system itself (in places like
> pg_stat_activity). Even the very basic terminology in this area is
> needlessly scary. Terms like "VACUUM (to prevent wraparound)" are
> uncomfortably close to "a race against time to avoid data corruption".
> The system isn't ever supposed to corrupt data, even if misconfigured
> (unless the misconfiguration is very low-level, such as "fsync=off").
> Users should be able to take that much for granted.

Granted. Whatever form your rewrite ends up in, it could make a lot of
sense to then backpatch a few localized corrections. I wouldn't even object
to including a few substitutions of s/wraparound failure/allocation
failure/ where appropriate. Let's see how that shakes out first.

> > I think the docs would do well to have ordered steps for recovering
> > from both XID and MXID exhaustion.
>
> I had planned to address this with my ongoing work on the "Routine
> Vacuuming" docs, but I think that you're right about the necessity of
> addressing it as part of this patch.

0003 is now a quick-and-dirty attempt at that, only in the docs. The MXID
part is mostly copy-pasted from the XID part, just to get something
together. I'd like to abbreviate that somehow.
--
John Naylor
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v8-0002-Stop-telling-users-to-run-VACUUM-in-a-single-user.patch text/x-patch 6.7 KB
v8-0001-Correct-outdated-docs-and-messages-regarding-XID-.patch text/x-patch 8.6 KB
v8-0003-Rough-draft-of-complete-steps-to-recover-from-M-X.patch text/x-patch 3.3 KB
