Re: Should we remove vacuum_defer_cleanup_age?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Should we remove vacuum_defer_cleanup_age?
Date: 2023-03-24 21:27:53
Message-ID: CAH2-WzmAhhPZrtKUxYJ+teCkhXst0sLO791+kWmcY+Ux7_A+uw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 18, 2023 at 2:34 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2023-Mar-17, Andres Freund wrote:
>
> > I started writing a test for vacuum_defer_cleanup_age while working on the fix
> > referenced above, but now I am wondering if said energy would be better spent
> > removing vacuum_defer_cleanup_age altogether.
>
> +1 I agree it's not useful anymore.

+1.

I am suspicious of most of the GUCs whose value is an XID age. It
strikes me as something that is convenient to the implementation, but
not to the user, since there are so many ways that XID age might be a
poor proxy for whatever it is that you really care about in each case.

A theoretical advantage of vacuum_defer_cleanup_age is that it allows
the user to control things in terms of the impact on the primary --
whereas hot_standby_feedback is a mechanism that controls things in
terms of the needs of the standby. In practice this is pretty useless,
but it seems like it might be possible to come up with some other new
mechanism that somehow does this in a way that's truly useful.
Something that allows the user to constrain how far we hold back
conflicts/vacuuming in terms of the *impact* on the primary.

It might be helpful to permit opportunistic cleanup by pruning and
index deletion at some point, but to throttle it when we know it would
violate some soft limit related to hot_standby_feedback. Maybe the
system could block the first few attempts at pruning when they would
violate the soft limit, or prune somewhat less aggressively where
there is little advantage to it in terms of
space/tuples freed -- decide on what to do at the very last minute,
based on all available information at that late stage, with the full
context available. The system could be taught to be very patient at
first, when relatively few pruning operations have been attempted,
when the cost is basically still acceptable. But as more pruning
operations ran and clearly didn't free space that really should be
freed, we'd quickly lose patience.
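
To make the "patient at first, then quickly lose patience" idea concrete,
here is a rough sketch in Python. It is only an illustration under stated
assumptions, not an existing or proposed PostgreSQL mechanism: the class
PruneThrottle, its fields, and the default thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PruneThrottle:
    """Hypothetical soft-limit throttle; none of these names exist in PostgreSQL."""

    free_attempts: int = 3        # assumed: deferrals we allow while still patient
    bytes_budget: int = 8192 * 4  # assumed: unreclaimed space we are willing to tolerate
    deferred_attempts: int = 0
    deferred_bytes: int = 0

    def should_prune(self, reclaimable_bytes: int, violates_soft_limit: bool) -> bool:
        """Decide at the last minute whether this pruning attempt goes ahead."""
        # Pruning that does not conflict with the standby is never held back.
        if not violates_soft_limit:
            return True
        # Patient phase: few deferrals so far, and the space left
        # unreclaimed is still within the budget.
        within_attempts = self.deferred_attempts < self.free_attempts
        within_budget = self.deferred_bytes + reclaimable_bytes <= self.bytes_budget
        if within_attempts and within_budget:
            self.deferred_attempts += 1
            self.deferred_bytes += reclaimable_bytes
            return False
        # Patience exhausted: prune, even though it conflicts with the standby.
        return True
```

The decision is made per attempt, with whatever is known at that moment,
and either limit (attempt count or accumulated unreclaimed space) is
enough to end the patient phase.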

The big idea here is to delay committing to any course of action for
as long as possible, so we wouldn't kill queries on standbys for very
little benefit on the primary, while at the same time avoiding ever
really failing to kill queries on standbys when the cost proved too
high on the primary. For this to have any chance of working it needs
to focus on the actual costs on the primary, and not some extremely
noisy proxy for that cost. The standby will have its query killed by
just one prune record affecting just one heap page, and delaying that
specific prune record is likely no big deal. It's preventing pruning
of tens of thousands of heap pages that we need to worry about.
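
The point about one page versus tens of thousands of pages can be
sketched as aggregate accounting on the primary. Again, this is only an
illustration: PrimaryCostTracker and max_deferred_pages are hypothetical
names, and counting distinct heap pages is just one assumed way to
measure "actual cost".

```python
class PrimaryCostTracker:
    """Hypothetical tracker of how many distinct heap pages had pruning held back."""

    def __init__(self, max_deferred_pages: int = 10_000):
        # Assumed threshold; the right value (and unit) is an open question.
        self.max_deferred_pages = max_deferred_pages
        self.deferred_pages: set[int] = set()

    def decide(self, page_no: int) -> str:
        """Return "defer" while the aggregate cost is small, else "prune"."""
        # Repeated attempts on the same page do not add to the cost;
        # only the number of distinct affected pages matters here.
        self.deferred_pages.add(page_no)
        if len(self.deferred_pages) > self.max_deferred_pages:
            return "prune"
        return "defer"
```

Delaying a single prune record costs almost nothing by this measure,
while a backlog spread over many pages eventually tips the decision
toward pruning and accepting the conflict on the standby.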

--
Peter Geoghegan
