Re: Making autovacuum logs indicate if insert-based threshold was the triggering condition

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Making autovacuum logs indicate if insert-based threshold was the triggering condition
Date: 2022-08-06 22:41:57
Message-ID: CAH2-Wzkr+EhpQUD4G7bX1=V1CEjVNWOUXWO23EBF8KOHEcKQ=Q@mail.gmail.com
Lists: pgsql-hackers

On Sat, Aug 6, 2022 at 2:50 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> This sounded familiar, and it seems like I anticipated that it might be an
> issue. Here, I was advocating for the new insert-based GUCs to default to -1,
> to have insert-based autovacuum fall back to the thresholds specified by the
> pre-existing GUCs (20% + 50), which would (in my proposal) remain the normal
> way to tune any type of vacuum.
>
> https://www.postgresql.org/message-id/20200317233218.GD26184@telsasoft.com
>
> I haven't heard of anyone who had trouble setting the necessary GUC, but I'm
> not surprised if most postgres installations are running versions before 13.

ISTM that having insert-based triggering conditions is definitely a
good idea, but what we have right now still has problems. It currently
won't work very well unless the user goes out of their way to tune
freezing to do the right thing. Typically we miss out on the
opportunity to freeze early, because without sophisticated
intervention from the user there is only a slim chance of *any*
freezing taking place outside of the inevitable antiwraparound
autovacuum.

> > Note that a VACUUM that is an "automatic vacuum for inserted tuples" cannot
> > [...] also be a "regular" autovacuum/VACUUM
>
> Why not ?

Well, autovacuum.c should have (and/or kind of already has) 3
different triggering conditions. These are mutually exclusive
conditions -- technically autovacuum.c always launches an autovacuum
against a table because exactly 1 of the 3 thresholds was crossed. My
patch makes sure that it always gives exactly one reason why
autovacuum.c decided to VACUUM, so by definition there is only one
relevant piece of information for vacuumlazy.c to report in the log.
That's fairly simple and high level, and presumably something that
users won't have much trouble understanding.
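
To make the "exactly one reason" idea concrete, here is a rough sketch in C.
It is not code from autovacuum.c or from the patch: the type and function
names (TriggerReason, TableStats, VacSettings, choose_trigger) are
hypothetical, and the order in which the conditions are tested is an
assumption made for illustration. Only the threshold arithmetic (base
threshold plus scale factor times reltuples, with a negative insert
threshold disabling the insert-based check) follows the documented GUCs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; not the identifiers used by autovacuum.c or the patch. */
typedef enum
{
    TRIGGER_NONE,           /* no threshold crossed, no autovacuum launched */
    TRIGGER_WRAPAROUND,     /* table age exceeded the freeze max age */
    TRIGGER_DEAD_TUPLES,    /* update/delete-based threshold crossed */
    TRIGGER_INSERTS         /* insert-based threshold crossed */
} TriggerReason;

typedef struct
{
    double   reltuples;         /* estimated live tuples in the table */
    double   dead_tuples;       /* dead tuples since the last vacuum */
    double   ins_since_vacuum;  /* tuples inserted since the last vacuum */
    uint32_t relfrozenxid_age;  /* age of relfrozenxid, in XIDs */
} TableStats;

typedef struct
{
    double   vac_base_thresh;       /* autovacuum_vacuum_threshold */
    double   vac_scale_factor;      /* autovacuum_vacuum_scale_factor */
    double   vac_ins_base_thresh;   /* autovacuum_vacuum_insert_threshold; < 0 disables */
    double   vac_ins_scale_factor;  /* autovacuum_vacuum_insert_scale_factor */
    uint32_t freeze_max_age;        /* autovacuum_freeze_max_age */
} VacSettings;

/*
 * Return a single reason, even when several thresholds are crossed at once.
 * The priority order used here is an assumption for illustration only.
 */
static TriggerReason
choose_trigger(const TableStats *t, const VacSettings *s)
{
    assert(t->reltuples >= 0);

    double vacthresh = s->vac_base_thresh + s->vac_scale_factor * t->reltuples;
    double vacinsthresh = s->vac_ins_base_thresh +
        s->vac_ins_scale_factor * t->reltuples;

    if (t->relfrozenxid_age > s->freeze_max_age)
        return TRIGGER_WRAPAROUND;
    if (t->dead_tuples > vacthresh)
        return TRIGGER_DEAD_TUPLES;
    if (s->vac_ins_base_thresh >= 0 && t->ins_since_vacuum > vacinsthresh)
        return TRIGGER_INSERTS;
    return TRIGGER_NONE;
}
```

With the stock settings (50 + 20% for dead tuples, 1000 + 20% for inserts),
a 10,000-tuple table that has seen 5,000 inserts and 100 dead tuples would
come out as TRIGGER_INSERTS in this sketch, since 5,000 exceeds 3,000 while
100 is well below 2,050. The point is only that the function returns one
value, so there is one thing for vacuumlazy.c to report.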

Right now antiwraparound autovacuum "implies aggressive", in that it
almost always makes vacuumlazy.c use aggressive mode, but this seems
totally arbitrary to me -- they don't have to be virtually synonymous.
I think that antiwraparound autovacuum could even be rebranded as "an
autovacuum that takes place because the table hasn't had one in a long
time". This is much less scary, and makes it clearer that autovacuum.c
shouldn't be expected to really understand what will turn out to be
important "at runtime". That's the time to make important decisions
about what work to do -- when we actually have accurate information.

My antiwraparound example is just that: an example. There is a broader
idea: we shouldn't be too confident that the exact triggering
condition autovacuum.c applied to launch an autovacuum worker turns
out to be the best reason to VACUUM, or even a good reason --
vacuumlazy.c should be able to cope with that. The user is kept in the
loop about both, by reporting the triggering condition and the details
of what really happened at runtime. Maybe vacuumlazy.c can be taught
to speed up and slow down based on the conditions it observes as it
scans the heap -- there are many possibilities.

This broader idea is pretty much what you were getting at with your
example, I think.

--
Peter Geoghegan
