effective_multixact_freeze_max_age issue

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>
Subject: effective_multixact_freeze_max_age issue
Date: 2022-08-02 23:12:18
Message-ID: CAH2-Wzn2hODG0KQNvtFsT-Ka=MrUNHRpQXZ8tRiiC-fudQ_kEw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The effective_multixact_freeze_max_age mechanism added by commit
53bb309d2d forces aggressive VACUUMs to take place earlier, as
protection against wraparound of the MultiXact member space SLRU.
There was also a follow-up bugfix several years later -- commit
6bda2af039 -- which made sure that the MXID-wise cutoff used to
determine which MXIDs to freeze in vacuumlazy.c could never exceed
oldestMxact (VACUUM generally cannot freeze MultiXacts that are still
seen as running by somebody according to oldestMxact).

I would like to talk about making the
effective_multixact_freeze_max_age stuff more robust, particularly in
the presence of a long held snapshot that holds things up even as SLRU
space for MultiXact members dwindles. I have to admit that I always
found this part of vacuum_set_xid_limits() confusing. I suspect that
the problem has something to do with how we calculate vacuumlazy.c's
multiXactCutoff (as well as FreezeLimit): vacuum_set_xid_limits() just
subtracts a freezemin value from GetOldestMultiXactId(). This is
confusingly similar (though different in important ways) to the
handling for other related cutoffs that happens nearby. In particular,
we start from ReadNextMultiXactId() (not from GetOldestMultiXactId())
for the cutoff that determines if the VACUUM is going to be
aggressive. I think that this can be fixed -- see the attached patch.

Of course, it wouldn't be safe to allow vacuum_set_xid_limits() to
hand off a multiXactCutoff to vacuumlazy.c that is (for whatever
reason) less than GetOldestMultiXactId()/oldestMxact (the bug fixed by
6bda2af039 involved just such a scenario). But that doesn't seem like
much of a problem to me. We can just handle it directly, as needed.
The attached patch handles it as follows:

/* Compute multiXactCutoff, being careful to generate a valid value */
*multiXactCutoff = nextMXID - mxid_freezemin;
if (*multiXactCutoff < FirstMultiXactId)
*multiXactCutoff = FirstMultiXactId;
/* multiXactCutoff must always be <= oldestMxact */
if (MultiXactIdPrecedes(*oldestMxact, *multiXactCutoff))
*multiXactCutoff = *oldestMxact;

That is, we only need to make sure that the "multiXactCutoff must
always be <= oldestMxact" invariant holds once, by checking for it
directly. The same thing happens with OldestXmin/FreezeLimit. That
seems like a simpler foundation. It's also a lot more logical. Why
should the cutoff for freezing be held back by a long running
transaction, except to the extent that it is strictly necessary to do
so to avoid wrong answers (wrong answers seen by the long running
transaction)?

This allows us to simplify the code that issues a WARNING about
oldestMxact/OldestXmin inside vacuum_set_xid_limits(). Why not
actually test oldestMxact/OldestXmin directly, without worrying about
the limits (multiXactCutoff/FreezeLimit)? That also seems more
logical; there is more to be concerned about than freezing being
blocked when OldestXmin gets very old. Though we still rely on the
autovacuum_freeze_max_age GUC to represent "a wildly unreasonable
number of XIDs for OldestXmin to be held back by", just because that's
still convenient.

--
Peter Geoghegan

Attachment Content-Type Size
v1-0001-Derive-VACUUM-s-FreezeLimit-from-next-transaction.patch application/octet-stream 10.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-08-02 23:43:57 Re: optimize lookups in snapshot [sub]xip arrays
Previous Message Arne Roland 2022-08-02 23:07:03 Re: missing indexes in indexlist with partitioned tables