Re: New strategies for freezing, advancing relfrozenxid early

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jeremy Schneider <schnjere(at)amazon(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2022-08-25 23:23:09
Message-ID: CAH2-Wzmu8B+exBbV49Hm2Wstf_nmsRXdaFZ60k+Du7D=G4OGKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 25, 2022 at 3:35 PM Jeremy Schneider <schnjere(at)amazon(dot)com> wrote:
> We should be careful here. IIUC, the current autovac behavior helps
> bound the "spread" or range of active multixact IDs in the system, which
> directly determines the number of distinct pages that contain those
> multixacts. If the proposed change herein causes the spread/range of
> MXIDs to significantly increase, then it will increase the number of
> blocks and increase the probability of thrashing on the SLRUs for these
> data structures.

As a general rule VACUUM will tend to do more eager freezing with the
patch set compared to HEAD, though it should never do less eager
freezing. Not even in corner cases -- never.

With the patch, VACUUM pretty much uses the most aggressive possible
XID-wise/MXID-wise cutoffs in almost all cases (though only when we
actually decide to freeze a page at all, which is now a separate
question). The fourth patch in the patch series introduces a very
limited exception, where we use the same cutoffs that we'll always use
on HEAD (FreezeLimit + MultiXactCutoff) instead of the aggressive
variants (OldestXmin and OldestMxact). This isn't just *any* xmax
containing a MultiXact: it's a Multi that contains *some* XIDs that
*need* to go away during the ongoing VACUUM, and others that *cannot*
go away. Oh, and there usually has to be a need to keep two or more
XIDs for this to happen -- if there is only one XID then we can
usually swap xmax with that XID without any fuss.

> PS. see also
> https://www.postgresql.org/message-id/247e3ce4-ae81-d6ad-f54d-7d3e0409a950@ardentperf.com

I think that the problem you describe here is very real, though I
suspect that it needs to be addressed by making opportunistic cleanup
of Multis happen more reliably. Running VACUUM more often just isn't
practical once a table reaches a certain size. In general, any kind of
processing that is time sensitive probably shouldn't be happening
solely during VACUUM -- it's just too risky. VACUUM might take a
relatively long time to get to the affected page. It might not even be
that long in wall clock time or whatever -- just too long to reliably
avoid the problem.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2022-08-26 00:14:51 Re: New strategies for freezing, advancing relfrozenxid early
Previous Message Jeremy Schneider 2022-08-25 22:35:17 Re: New strategies for freezing, advancing relfrozenxid early