Re: Berserk Autovacuum (let's save next Mandrill)

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-20 06:17:40
Message-ID: 9c73162fa7e7bc8cfc80666f8702937c43a0aaeb.camel@cybertec.at
Lists: pgsql-hackers

On Thu, 2020-03-19 at 14:38 -0700, Andres Freund wrote:
> > I am not sure about b). In my mind, the objective is not to prevent
> > anti-wraparound vacuums, but to see that they have less work to do,
> > because previous autovacuum runs already have frozen anything older than
> > vacuum_freeze_min_age. So, assuming linear growth, the number of tuples
> > to freeze during any run would be at most one fourth of today's number
> > when we hit autovacuum_freeze_max_age.
>
> This whole chain of arguments seems like it actually has little to do
> with vacuuming insert only/mostly tables. The same problem exists for
> tables that aren't insert only/mostly. Instead it IMO is an argument for
> a general change in logic about when to freeze.

My goal was to keep individual vacuum runs from having too much
work to do. The freezing was an afterthought.

The difference (for me) is that I am more convinced that the insert
rate for insert-only tables is constant over time than I am that the
update rate is constant.

> What exactly is it that you want to achieve by having anti-wrap vacuums
> be quicker? If the goal is to reduce the window in which autovacuums
> aren't automatically cancelled when there's a conflicting lock request,
> or in which autovacuum just schedules based on xid age, then you can't
> have wraparound vacuums needing to do a substantial amount of work.
>
> Except for not auto-cancelling, and the autovac scheduling issue,
> there's really nothing magic about anti-wrap vacuums.

Yes. I am under the impression that it is the duration and amount
of work per vacuum run that is the problem here, not the aggressiveness
as such.

If you are in the habit of frequently locking tables with high
lock modes (and I have seen people do that), you are lost anyway:
normal autovacuum runs will always die, and anti-wraparound vacuum
will kill you. There is nothing we can do about that, except perhaps
put a fat warning in the documentation of LOCK.

> If the goal is to avoid redundant writes, then it's largely unrelated to
> anti-wrap vacuums, and can to a large degree be addressed by
> opportunistically freezing (best even during hot pruning!).
>
>
> I am more and more convinced that it's a seriously bad idea to tie
> committing "autovacuum after inserts" to also committing a change in
> logic around freezing. That's not to say we shouldn't try to address
> both this cycle, but discussing them as if they really are one item
> makes it both more likely that we get nothing in, and more likely that
> we miss the larger picture.

I hear you, and I agree that we shouldn't do it with this patch.

> If there are no other modifications to the page, more aggressively
> freezing can lead to seriously increased write volume. It's quite normal
> to have databases where data in insert only tables *never* gets old
> enough to need to be frozen (either because xid usage is low, or because
> older partitions are dropped). If data in an insert-only table isn't
> write-only, the hint bits are likely to already be set, which means that
> vacuum will just cause the entire table to be written another time,
> without a reason.
>
>
> I don't see how it's ok to substantially regress this very common
> workload. IMO this basically means that more aggressively and
> non-opportunistically freezing simply is a no-go (be it for insert or
> other causes for vacuuming).
>
> What am I missing?

Nothing that I can see, and these are good examples of why eager freezing
may not be such a smart idea after all.

I think your idea of freezing everything on a page when we know it is
going to be dirtied anyway is the smartest way of going about that.

My only remaining quibbles are about scale factor and threshold, see
my other mail.

Yours,
Laurenz Albe
