Re: do only critical work during single-user vacuum?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: do only critical work during single-user vacuum?
Date: 2022-02-04 02:08:03
Message-ID: CA+TgmoZ8=xgo6cZi2a_ig27=FrSEs3zXRaGtLn1qM7RhNrS6sg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 3, 2022 at 8:35 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Yea, I'd have no problem leaving the "hard" limit somewhere closer to 1
> million (although 100k should be just as well), but introduce a softer "only
> vacuum/drop/truncate" limit a good bit before that.

+1.

> To address the "as long as" part: I think that describing better what is
> holding back the horizon would be a significant usability improvement.
>
> Imagine that instead of the generic hints in these messages:
> ereport(ERROR,
> (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> errmsg("database is not accepting commands to avoid wraparound data loss in database \"%s\"",
> oldest_datname),
> errhint("Stop the postmaster and vacuum that database in single-user mode.\n"
> "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
> and
> ereport(WARNING,
> (errmsg("oldest xmin is far in the past"),
> errhint("Close open transactions soon to avoid wraparound problems.\n"
> "You might also need to commit or roll back old prepared transactions, or drop stale replication slots.")));
>
> we'd actually tell the user a bit more what about what is causing the
> problem.
>
> We can compute the:
> 1) oldest slot by xmin, with name
> 2) oldest walsender by xmin, with pid
> 3) oldest prepared transaction id by xid / xmin, with name
> 4) oldest in-progress transaction id by xid / xmin, with name
> 5) oldest database datfrozenxid, with database name
>
> If 1-4) are close to 5), there's no point in trying to vacuum aggressively, it
> won't help. So we instead can say that the xmin horizon (with a better name)
> is held back by the oldest of these, with enough identifying information for
> the user to actually know where to look.

Yes. This kind of thing strikes me as potentially a huge help. To
rephrase that in other terms, we could tell the user what the actual
problem is instead of suggesting to them that they shut down their
database just for fun. It's "just for fun" because (a) it typically
won't fix the real problem, which is most often (1) or (3) from your
list, and even if it's (2) or (4) they could just kill the session
instead of shutting down the whole database, and (b) no matter what
needs to be done, whether it's VACUUM or ROLLBACK PREPARED or
something else, they may as well do that thing in multi-user mode
rather than single-user mode, unless we as PostgreSQL developers
forgot to make that actually work.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-02-04 02:24:08 Re: Windows now has fdatasync()
Previous Message Fujii Masao 2022-02-04 01:59:04 Re: Add checkpoint and redo LSN to LogCheckpointEnd log message