Re: New strategies for freezing, advancing relfrozenxid early

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-27 01:15:10
Message-ID: 20230127011510.ta3kpomnkzxz5nwg@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2023-01-26 14:27:53 -0500, Robert Haas wrote:
> One idea that I've had about how to solve this problem is to try to
> make vacuum try to aggressively freeze some portion of the table on
> each pass, and to behave less aggressively on the rest of the table so
> that, hopefully, no single vacuum does too much work.

I agree that this rough direction is worthwhile to pursue.

> Unfortunately, I don't really know how to do that effectively. If we knew
> that the table was going to see 10 vacuums before we hit
> autovacuum_freeze_max_age, we could try to have each one do 10% of the
> amount of freezing that was going to need to be done rather than letting any
> single vacuum do all of it, but we don't have that sort of information.

I think, quite fundamentally, it's not possible to bound the amount of work an
anti-wraparound vacuum has to do if we don't have an age-based autovacuum
trigger kicking in before autovacuum_freeze_max_age. After all, there might be
no autovacuum at all before autovacuum_freeze_max_age is reached.

But there's just no reason not to have a trigger below
autovacuum_freeze_max_age. That's why I think Peter's patch to split age-based
and anti-"auto-cancel" autovacuums is a strictly necessary change if we want to
make autovacuum fundamentally suck less. There are a few boring details to
figure out about how to set/compute those limits, but I don't think there's
anything fundamentally hard.

I think we also need the number of all-frozen pages in pg_class if we want to
make better scheduling decisions. As we already compute the number of
all-visible pages at the end of vacuuming, we can compute the number of
all-frozen pages as well. The space for another integer in pg_class doesn't
bother me one bit.

Let's say we had an autovacuum_vacuum_age trigger of 100m, and
autovacuum_freeze_max_age=500m. We know that we're roughly going to be
vacuuming 5 times before reaching autovacuum_freeze_max_age (very slow
autovacuums are an issue, but if one autovacuum spans 100m+ xids, there's
not much we can do).

With that we could determine the amount of eager freezing along the lines of:
frozen_target = Min(age(relfrozenxid), autovacuum_freeze_max_age) / autovacuum_freeze_max_age
eager_freeze_pages = Max(0, frozen_target * relpages - pg_class.relallfrozen)
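
For illustration, a minimal sketch of that computation in C, assuming a
hypothetical pg_class.relallfrozen page counter (analogous to relallvisible);
the names are placeholders, not an actual patch:

#include <stdint.h>

/*
 * How many additional pages should this (auto)vacuum try to freeze eagerly?
 * frozen_target is the fraction of the table we'd like to see frozen by the
 * time relfrozenxid's age reaches autovacuum_freeze_max_age.
 */
static int64_t
eager_freeze_pages(int64_t relfrozenxid_age,
                   int64_t autovacuum_freeze_max_age,
                   int64_t relpages,
                   int64_t relallfrozen)   /* hypothetical pg_class column */
{
    int64_t capped_age = relfrozenxid_age < autovacuum_freeze_max_age ?
        relfrozenxid_age : autovacuum_freeze_max_age;
    double  frozen_target = (double) capped_age /
        (double) autovacuum_freeze_max_age;
    double  behind = frozen_target * (double) relpages - (double) relallfrozen;

    /* clamp at zero - we may already be ahead of the target */
    return behind > 0.0 ? (int64_t) behind : 0;
}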

One thing I don't know fully how to handle is how to ensure that we try to
freeze a different part of the table each vacuum. I guess we could store a
page number in pgstats?
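
Purely as an illustration (the pgstats field is made up), the cursor could be
as simple as a per-table block number that each vacuum resumes from and then
advances, wrapping around at the end of the relation:

#include <stdint.h>

typedef struct TableFreezeCursor
{
    int64_t next_block;         /* made-up per-table pgstats field */
} TableFreezeCursor;

/* which block should this vacuum start its eager freezing pass at? */
static int64_t
eager_freeze_start(const TableFreezeCursor *cursor, int64_t relpages)
{
    if (relpages <= 0)
        return 0;
    return cursor->next_block % relpages;
}

/* remember where we stopped, so the next vacuum continues from there */
static void
eager_freeze_remember(TableFreezeCursor *cursor,
                      int64_t start_block, int64_t blocks_scanned)
{
    cursor->next_block = start_block + blocks_scanned;
}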

This would help address the "cliff" issue of reaching
autovacuum_freeze_max_age. What it would *not* address, on its own, is the
number of times we rewrite pages.

I can guess at a few ways to heuristically identify when tables are "append
mostly" from vacuum's view (a table can be update heavy, but very localized to
recent rows, and still be append mostly from vacuum's view). There are obvious
cases, e.g. when there are way more inserts than dead rows. But other cases
are harder.
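
For the obvious case, a crude check might just compare counters we already
have, roughly pg_stat_all_tables.n_ins_since_vacuum vs. n_dead_tup; the 10x
ratio below is an arbitrary placeholder, not a proposal:

#include <stdbool.h>
#include <stdint.h>

/*
 * Treat a table as "append mostly" from vacuum's view when inserts since
 * the last vacuum dwarf the accumulated dead tuples.
 */
static bool
looks_append_mostly(int64_t inserts_since_vacuum, int64_t dead_tuples)
{
    return inserts_since_vacuum > 10 * dead_tuples;
}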

> Also, even if we did have that sort of information, the idea only works if
> the pages that we freeze sooner are ones that we're not about to update or
> delete again, and we don't have any idea what is likely there.

Perhaps we could use something like
(age(relfrozenxid) - age(newest_xid_on_page)) / age(relfrozenxid)
as a heuristic?
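
As a sketch (ages assumed to be computed already, wraparound ignored): pages
modified recently score close to 1.0, pages whose newest XID is nearly as old
as relfrozenxid score close to 0.0:

#include <stdint.h>

static double
page_recency_score(uint64_t relfrozenxid_age, uint64_t newest_xid_on_page_age)
{
    if (relfrozenxid_age == 0)
        return 0.0;
    /* newest_xid_on_page can't be older than relfrozenxid */
    return (double) (relfrozenxid_age - newest_xid_on_page_age) /
        (double) relfrozenxid_age;
}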

I have a gut feeling that we should somehow collect/use statistics about the
number of frozen pages, marked as such by the last (or recent?) vacuum, that
had to be "defrosted" by backends. But I don't quite know how to yet. I think
we could collect statistics about that by storing the LSN of the last vacuum
in the shared stats, and incrementing that counter when defrosting.
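
Roughly like this, with entirely made-up names; the point is just that a page
whose LSN still predates the end of the last vacuum was plausibly left
all-frozen by it:

#include <stdint.h>

typedef struct TableFreezeStats
{
    uint64_t last_vacuum_end_lsn;   /* would live in shared table stats */
    uint64_t pages_defrosted;       /* frozen pages modified again since */
} TableFreezeStats;

/* called when a backend clears the all-frozen bit for a page */
static void
report_defrost(TableFreezeStats *stats, uint64_t page_lsn)
{
    /*
     * If the page hasn't been modified since the last vacuum finished, it
     * was plausibly left all-frozen by that vacuum (or an earlier one), so
     * count the defrost against it.
     */
    if (page_lsn <= stats->last_vacuum_end_lsn)
        stats->pages_defrosted++;
}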

A lot of things like that would work a whole lot better if we had statistics
that take older data into account, but weigh it less than more recent
data. But that's hard/expensive to collect.
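
One cheap approximation, just to illustrate the idea (the smoothing factor is
arbitrary): an exponential moving average updated once per vacuum, so older
observations decay instead of being dropped:

/* next_estimate = alpha * latest_sample + (1 - alpha) * previous_estimate */
static double
decayed_average(double previous_estimate, double latest_sample, double alpha)
{
    return alpha * latest_sample + (1.0 - alpha) * previous_estimate;
}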

Greetings,

Andres Freund
