Re: Berserk Autovacuum (let's save next Mandrill)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-12 06:14:12
Message-ID: CAApHDvov+5zB92Tb=9roXQ81dH+f6MVYXRRod-z9NUYhVpduUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 12 Mar 2020 at 18:38, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
>
> On Thu, 2020-03-12 at 17:47 +1300, David Rowley wrote:
> > Laurenz, are you really set on the 10 million threshold?
>
> These values are almost the same as "autovacuum_vacuum_scale_factor"
> and "autovacuum_vacuum_threshold", so you actually agree with Masahiko
> with the exception that you want it tunable separately.
>
> I don't like the high scale factor.
>
> If your insert-only table was last vacuumed when it had 500 million rows,
> the next autovacuum will freeze 150 million tuples, which is a lot.
> The impact will be less than that of an anti-wraparound vacuum because
> it is not as persistent, but if our 150 million tuple autovacuum backs
> down because it hits a lock or gets killed by the DBA, that is also not
> good, since it will just come again.
> And the bigger the vacuum run is, the more likely it is to meet an obstacle.
>
> So I think that large insert-only tables should be vacuumed more often
> than that. If the number of tuples that have to be frozen is small,
> the vacuum run will be short and is less likely to cause problems.
> That is why I chose a scale factor of 0 here.

That's a good point. If those 150 million inserts were done one per
transaction, then it wouldn't take many more tuples before wraparound
vacuums occur more often than insert vacuums. The only way I see
around that is to a) configure it the way you'd like, or; b) add yet
another GUC and reloption to represent how close to
autovacuum_freeze_max_age / autovacuum_multixact_freeze_max_age the
table is. I'm not very excited about adding yet another GUC, plus
anti-wraparound vacuums already occur 10 times more often than they
need to. If we added such a GUC and set it to, say, 0.1, then they'd
happen 100 times more often than needed before actual wraparound
occurs.

I'm starting to see now why you were opposed to the scale_factor in
the first place.

I really think that this is really a problem with the design of the
threshold and scale_factor system. I used to commonly see people with
larger tables zeroing out the scale_factor and setting a reasonable
threshold or dropping the scale_factor down to some fraction of a
percent. I don't really have any better design in mind though, at
least not one that does not require adding new vacuum options.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2020-03-12 06:31:09 Re: Planning counters in pg_stat_statements (using pgss_store)
Previous Message Masahiko Sawada 2020-03-12 06:12:44 Re: Some problems of recovery conflict wait events