Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date: 2023-01-17 18:33:21
Message-ID: 20230117183321.2t45rko5snr3d4zt@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-01-17 10:26:52 -0500, Robert Haas wrote:
> On Mon, Jan 16, 2023 at 11:11 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> > On Mon, Jan 16, 2023 at 8:25 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > > I really dislike formulas like Min(freeze_max_age * 2, 1 billion).
> > > That looks completely magical from a user perspective. Some users
> > > aren't going to understand autovacuum behavior at all. Some will, and
> > > will be able to compare age(relfrozenxid) against
> > > autovacuum_freeze_max_age. Very few people are going to think to
> > > compare age(relfrozenxid) against some formula based on
> > > autovacuum_freeze_max_age. I guess if we document it, maybe they will.
> >
> > What do you think of Andres' autovacuum_no_auto_cancel_age proposal?
>
> I like it better than your proposal. I don't think it's a fundamental
> improvement and I would rather see a fundamental improvement, but I
> can see it being better than nothing.

That's similar to my feelings about it.

I do think it'll be operationally nice to have at least some window where an
autovacuum is triggered due to age and where it won't prevent cancels. In many
situations a window of roughly autovacuum_naptime * xids_per_sec XIDs will
likely suffice, but of course that's easily enough exceeded.
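To make that window concrete, here's a back-of-the-envelope model (a sketch with made-up numbers; `no_auto_cancel_age` mirrors the *proposed*, not existing, autovacuum_no_auto_cancel_age GUC, and `xids_per_sec` is an illustrative workload figure):

```c
#include <stdint.h>

/*
 * Rough model of the cancelable window: the XID-age gap between the
 * point where an age-triggered autovacuum starts (freeze_max_age) and
 * the point where it would stop being cancelable (no_auto_cancel_age).
 * All names and numbers here are illustrative, not real GUC semantics.
 */
static int64_t
cancelable_window_xids(int64_t freeze_max_age, int64_t no_auto_cancel_age)
{
	return no_auto_cancel_age - freeze_max_age;
}

/*
 * Does the window survive at least one naptime of XID consumption,
 * i.e. is there room for a cancelable age-triggered autovacuum attempt
 * before the non-cancelable threshold is crossed?
 */
static int
window_survives_one_naptime(int64_t window_xids,
							int64_t xids_per_sec,
							int64_t naptime_secs)
{
	return window_xids > xids_per_sec * naptime_secs;
}
```

At 2000 XIDs/sec and a 60s naptime, any window narrower than 120,000 XIDs is consumed before the next worker even wakes up.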

> > > I do like the idea of driving the auto-cancel behavior off of the
> > > results of previous attempts to vacuum the table. That could be done
> > > independently of the XID age of the table.
> >
> > Even when the XID age of the table has already significantly surpassed
> > autovacuum_freeze_max_age, say due to autovacuum worker starvation?
> >
> > > If we've failed to vacuum
> > > the table, say, 10 times, because we kept auto-cancelling, it's
> > > probably appropriate to force the issue.
> >
> > I suggested 1000 times upthread. 10 times seems very low, at least if
> > "number of times cancelled" is the sole criterion, without any
> > attention paid to relfrozenxid age or some other tiebreaker.
>
> Hmm, I think that a threshold of 1000 is far too high to do much good.

Agreed.

> By the time we've tried to vacuum a table 1000 times and failed every
> time, I anticipate that the situation will be pretty dire, regardless
> of why we thought the table needed to be vacuumed in the first place.

Agreed.

> In the best case, with autovacuum_naptime=1 minute, failing 1000 times
> means that we've delayed vacuuming the table for at least 16 hours.
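Spelling out that arithmetic (a trivial sketch; it assumes one attempt per naptime, which is the best case):

```c
/*
 * Minimum delay implied by N consecutive auto-cancels, assuming one
 * attempt per naptime cycle.  With naptime = 1 minute, 1000 failures
 * means the table has gone unvacuumed for at least ~16.7 hours.
 */
static double
min_delay_hours(int failures, double naptime_minutes)
{
	return failures * naptime_minutes / 60.0;
}
```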

Perhaps it'd make sense for an auto-cancelled worker to signal the launcher to
do a cycle of vacuuming? Or even to just try to vacuum the table again
immediately? After all, we know that the table is going to be on the schedule
of the next worker immediately. Of course we shouldn't retry indefinitely, but
...
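A bounded-retry version of that idea might look like this (purely hypothetical policy, not anything in autovacuum.c today; `max_immediate_retries` is a made-up knob):

```c
#include <stdbool.h>

/*
 * Hypothetical: after being auto-cancelled, should the worker (or the
 * launcher, once signalled) try the table again right away instead of
 * waiting for the next naptime cycle?  Bounded, so we don't retry
 * indefinitely against a persistently conflicting lock.
 */
static bool
retry_immediately_after_cancel(int consecutive_cancels,
							   int max_immediate_retries)
{
	return consecutive_cancels <= max_immediate_retries;
}
```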

> In fact I think there's a decent argument that a threshold of ten is
> possibly too high here, too. If you wait until the tenth try before
> you try not auto-cancelling, then a table with a workload that makes
> auto-cancelling 100% probable will get vacuumed 10% as often as it
> would otherwise. I think there are cases where that would be OK, but
> probably on the whole it's not going to go very well.

That's already kind of the case - we'll only block auto-cancelling when
exceeding autovacuum_freeze_max_age, all the other autovacuums will be
cancelable.
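In pseudo-code, the current rule as described above is roughly this (a simplified model of the behavior, not the actual autovacuum.c logic):

```c
#include <stdint.h>

/*
 * Simplified model of today's behavior: an autovacuum yields to a
 * conflicting lock request unless the table's relfrozenxid age has
 * already exceeded autovacuum_freeze_max_age.
 */
static int
autovacuum_is_cancelable(int64_t relfrozenxid_age, int64_t freeze_max_age)
{
	return relfrozenxid_age <= freeze_max_age;
}
```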

> The only problem I see with lowering the threshold below ~10 is that the
> signal starts to get weak. If something fails for the same reason ten times
> in a row you can be pretty sure it's a chronic problem. If you made the
> threshold say three you'd probably start making bad decisions sometimes --
> you'd think that you had a chronic problem when really you just got a bit
> unlucky.

Yea. Schema migrations in prod databases typically have to run in
single-statement or very small transactions, for obvious reasons. Needing to
lock the same table exclusively a few times during a schema migration is
pretty normal, particularly when foreign keys are involved. Getting blocked by
autovacuum in the middle of a schema migration is NASTY.

This is why I'm a bit worried that 10 might be too low... It's not absurd for
a schema migration to create 10 new tables referencing an existing table in
need of vacuuming.

Perhaps we should track when the first failure was, and take that into
account? Clearly having all 10 autovacuums on the same table cancelled is
different when those 10 cancellations happened in the last 10 *
autovacuum_naptime minutes, than when the last successful autovacuum was hours
ago.
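One way to sketch that (hypothetical policy; both thresholds are made-up knobs): only escalate to non-cancelable mode once the cancel count is high *and* the last successful vacuum is sufficiently far in the past.

```c
#include <stdbool.h>

/*
 * Hypothetical escalation rule combining the two signals: repeated
 * cancellations alone are not enough, so a burst of 10 cancels within
 * 10 * autovacuum_naptime does not look the same as 10 cancels spread
 * over hours with no successful vacuum in between.
 */
static bool
force_non_cancelable(int cancel_count,
					 double hours_since_last_success,
					 int count_threshold,
					 double hours_threshold)
{
	return cancel_count >= count_threshold &&
		hours_since_last_success >= hours_threshold;
}
```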

> To get back to the earlier question above, I think that if the
> retries-before-not-auto-cancelling threshold were low enough to be
> effective, you wouldn't necessarily need to consider XID age as a
> second reason for not auto-cancelling. You would want to force the
> behavior anyway when you hit emergency mode, because that should force
> all the mitigations we have, but I don't know that you need to do
> anything before that.

Hm, without further restrictions, that has me worried. It's not crazy to have
a LOCK TABLE on a small-ish table be part of your workload - I've certainly
seen it plenty of times. Suddenly blocking on that for a few minutes, just
because a bit of bloat has collected, seems likely to cause havoc.

Greetings,

Andres Freund
