Re: do only critical work during single-user vacuum?

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "Bossart, Nathan" <bossartn(at)amazon(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: do only critical work during single-user vacuum?
Date: 2022-01-12 00:58:56
Message-ID: CAFBsxsEoYaGVFq+7UeW2kZzO8pCAOTUB+j5RNw-Hm-w_Xo97Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 21, 2021 at 4:56 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> But if we're going to add a new option to the VACUUM command (or
> something of similar scope), then we might as well add a new behavior
> that is reasonably exact -- something that (say) only *starts* a
> VACUUM for those tables whose relfrozenxid age currently exceeds half
> the autovacuum_freeze_max_age for the table (usually taken from the
> GUC, sometimes taken from the reloption), which also forces the
> failsafe. And with similar handling for
> relminmxid/autovacuum_multixact_freeze_max_age.

> This new command/facility should probably not be a new flag to the
> VACUUM command, as such. Rather, I think that it should either be an
> SQL-callable function, or a dedicated top-level command (that doesn't
> accept any tables). The only reason to have this is for scenarios
> where the user is already in a tough spot with wraparound failure,
> like that client of yours. Nobody wants to force the failsafe for one
> specific table. It's not general purpose, at all, and shouldn't claim
> to be.

I've attached a PoC *untested* patch to show what it would look like
as a top-level statement. If the "shape" is uncontroversial, I'll put
work into testing it and fleshing it out.

For the PoC I wanted to try re-using existing keywords. I went with
"VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as
a table name. It also brings "wraparound limit" to mind. We could add
a single-use unreserved keyword (such as VACUUM_MINIMAL or
VACUUM_FAST), but that doesn't seem great.

> In other words, while triggering the failsafe is important, simply *not
> starting* VACUUM for relations where there is really no need for it is
> at least as important. We shouldn't even think about pruning or
> freezing with these tables. (ISTM that the only thing that might be a
> bit controversial about any of this is my definition of "safe", which
> seems like about the right trade-off to me.)

I'm not sure what the right trade-off is, but as written I used 95% of
max age. It might be undesirable to end up so close to kicking off
uninterruptible vacuums, but the point is to get out of single-user
mode and back to streaming WAL as quickly as possible. It might also
be worth overriding the min ages as well, but haven't done so here.

It can be executed in normal mode (although it's not expected to be),
which makes testing easier and allows for a future possibility of not
requiring shutdown at all, by e.g. terminating non-superuser
connections.

--
John Naylor
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-vacuum-select-tables-closest-to-wraparound.patch text/x-patch 10.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-01-12 01:18:05 Re: [Ext:] Re: Stream Replication not working
Previous Message Alvaro Herrera 2022-01-12 00:41:28 Re: Column Filtering in Logical Replication