Re: We should stop telling users to "vacuum that database in single-user mode"

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Hannu Krosing <hannuk(at)google(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: We should stop telling users to "vacuum that database in single-user mode"
Date: 2021-03-03 10:33:03
Message-ID: CAApHDvqATkE7iptsBmg-UH1AjVsp9SOunsZNiOdTY-JeD7UhUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 3 Mar 2021 at 21:44, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
> On Tue, Mar 2, 2021 at 10:07 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> >
> > On Wed, 3 Mar 2021 at 01:12, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> > >
> > > On Tue, Mar 2, 2021 at 7:52 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > > > I have seen it happen that an instance has a vacuum_cost_limit set and
> > > > someone did start the database in single-user mode, per the advice of
> > > > the error message only to find that the VACUUM took a very long time
> > > > due to the restrictive cost limit. I struggle to imagine why anyone
> > > > wouldn't want the vacuum to run as quickly as possible in that
> > > > situation.
> > >
> > > Multiple instances running on the same hardware and only one of them
> > > being in trouble?
> >
> > You might be right. I'm not saying it's a great idea but thought it
> > was worth considering.
> >
> > We could turn to POLA and ask; what would you be more surprised at; 1)
> > Your database suddenly using more I/O than it had been previously, or;
> > 2) Your database no longer accepting DML.
>
> I think we misunderstand each other. I meant this only as a comment
> about the idea of ignoring the cost limit in single user mode -- that
> is, it's a reason to *want* vacuum to not run as quickly as possible
> in single user mode. I should've trimmed the email better.

I meant to ignore the cost limits if we're within a hundred million or
so of the stopLimit. Per what Hannu mentioned, there does not seem to
be a great need with current versions of PostgreSQL to restart in the
instance in single-user mode. VACUUM still works once we're beyond the
stopLimit. It's just commands that need to generate a new XID that'll
fail with the error message mentioned by Hannu.

> I agree with your other idea, that of kicking in a more aggressive
> autovacuum if it's not dealing with things fast enough. Maybe even on
> an incremental way - that is run with the default, then at another
> threshold drop them to half, and at yet another threshold drop them to
> 0. I agree that pretty much anything is better than forcing the user
> into single user mode.

OK cool. I wondered if it should be reduced incrementally or just
switch off the cost limit completely once we're beyond
ShmemVariableCache->xidStopLimit. If we did want it to be incremental
then if we had say ShmemVariableCache->xidFastVacLimit, which was
about 100 million xids before xidStopLimit, then the code could adjust
the sleep delay down by the percentage through we are from
xidFastVacLimit to xidStopLimit.

However, if we want to keep adjusting the sleep delay then we need to
make that work for vacuums that are running already. We don't want to
call ReadNextTransactionId() too often, but maybe if we did it once
per 10 seconds worth of vacuum_delay_point()s. That way we'd never do
it for vacuums already going at full speed.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2021-03-03 11:08:04 Re: Why OR-clauses not getting copied into baserestrictinfo of another table whose columns are in the same EquivalenceClass?
Previous Message Daniel Gustafsson 2021-03-03 10:31:22 Re: Disallow SSL compression?