Re: Berserk Autovacuum (let's save next Mandrill)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-19 21:38:19
Message-ID: 20200319213819.ghjiilbisw33aqxe@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-03-19 06:45:48 +0100, Laurenz Albe wrote:
> On Tue, 2020-03-17 at 18:02 -0700, Andres Freund wrote:
> > I don't think a default scale factor of 0 is going to be ok. For
> > large-ish tables this will basically cause permanent vacuums. And it'll
> > sometimes trigger for tables that actually coped well so far. 10 million
> > rows could be a few seconds, not more.
> >
> > I don't think that the argument that otherwise a table might not get
> > vacuumed before autovacuum_freeze_max_age is convincing enough.
> >
> > a) if that's indeed the argument, we should increase the default
> > autovacuum_freeze_max_age - now that there's insert triggered vacuums,
> > the main argument against that from before isn't valid anymore.
> >
> > b) there's not really a good argument for vacuuming more often than
> > autovacuum_freeze_max_age for such tables. It'll not be frequent
> > enough to allow IOS for new data, and you're not preventing
> > anti-wraparound vacuums from happening.
>
> According to my reckoning, that is the remaining objection to the patch
> as it is (with ordinary freezing behavior).
>
> How about a scale_factor of 0.005? That will be high enough for large
> tables, which seem to be the main concern here.

Seems low at first blush. On a large-ish table with 1 billion tuples,
we'd vacuum every 5 million inserts. For many ETL workloads this will
result in a vacuum after every bulk operation. Potentially with an index
scan associated (even if there are no errors, a lot of bulk loads use
INSERT ... ON CONFLICT, leading to the occasional update).
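
To make that arithmetic concrete, here is a rough sketch. It assumes the
insert-driven trigger takes the same shape as the existing dead-tuple
trigger (a base threshold plus scale factor times reltuples). The function
name and the base_threshold parameter are hypothetical and not taken from
the patch. The 0.2 value is the existing default of
autovacuum_vacuum_scale_factor and is included only for comparison.

```python
def insert_vacuum_threshold(reltuples, scale_factor, base_threshold=0):
    """Inserts needed since the last vacuum before an insert-driven vacuum fires.

    Assumed form: base_threshold + scale_factor * reltuples, the same shape as
    the existing dead-tuple trigger. All names here are hypothetical.
    """
    return base_threshold + scale_factor * reltuples


if __name__ == "__main__":
    # Compare the proposed 0.005 against 0.2 for a few table sizes.
    for reltuples in (10_000_000, 100_000_000, 1_000_000_000):
        for scale_factor in (0.005, 0.2):
            threshold = insert_vacuum_threshold(reltuples, scale_factor)
            print(f"{reltuples:>13,} tuples, scale factor {scale_factor}: "
                  f"vacuum after ~{threshold:,.0f} inserts")
```

Under these assumptions, any bulk load of more than about 5 million rows
into the 1-billion-tuple table crosses the threshold on its own.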

Personally I think we should be considerably more conservative in the
first release or two. Exposing a lot of people that previously didn't
have a lot of problems to vacuuming being *massively* more aggressive,
basically permanently running on an insert only table, will be bad.

> I fully agree with your point a) - should that be part of the patch?

We can just make it a separate patch committed shortly afterwards.

> I am not sure about b). In my mind, the objective is not to prevent
> anti-wraparound vacuums, but to see that they have less work to do,
> because previous autovacuum runs already have frozen anything older than
> vacuum_freeze_min_age. So, assuming linear growth, the number of tuples
> to freeze during any run would be at most one fourth of today's number
> when we hit autovacuum_freeze_max_age.

This whole chain of arguments seems like it actually has little to do
with vacuuming insert only/mostly tables. The same problem exists for
tables that aren't insert only/mostly. Instead it IMO is an argument for
a general change in logic about when to freeze.

What exactly is it that you want to achieve by having anti-wrap vacuums
be quicker? If the goal is to reduce the window in which autovacuums
aren't automatically cancelled when there's a conflicting lock request,
or in which autovacuum just schedules based on xid age, then you can't
have wraparound vacuums needing to do a substantial amount of work.

Except for not auto-cancelling, and the autovac scheduling issue,
there's really nothing magic about anti-wrap vacuums.

If the goal is to avoid redundant writes, then it's largely unrelated to
anti-wrap vacuums, and can to a large degree be addressed by
opportunistically freezing (best even during hot pruning!).

I am more and more convinced that it's a seriously bad idea to tie
committing "autovacuum after inserts" to also committing a change in
logic around freezing. That's not to say we shouldn't try to address
both this cycle, but discussing them as if they really are one item
makes it both more likely that we get nothing in, and more likely that
we miss the larger picture.

> I am still sorry to see more proactive freezing go, which would
> reduce the impact for truly insert-only tables.
> After sleeping on it, here is one last idea.
>
> Granted, freezing with vacuum_freeze_min_age = 0 poses a problem
> for those parts of the table that will receive updates or deletes.

IMO it's not at all just those regions that are potentially negatively
affected:
If there are no other modifications to the page, more aggressive
freezing can lead to seriously increased write volume. It's quite normal
to have databases where data in insert only tables *never* gets old
enough to need to be frozen (either because xid usage is low, or because
older partitions are dropped). If data in an insert-only table isn't
write-only, the hint bits are likely to already be set, which means that
vacuum will just cause the entire table to be written another time,
without a reason.
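
A deliberately crude model of that write amplification, as a sketch only:
it counts one heap page write to load each page and one more whenever a
vacuum freezes it, and ignores WAL, hint-bit writes and indexes. The
function and parameter names are hypothetical.

```python
def heap_pages_written(table_pages, freeze_aggressively, data_ever_needs_freezing):
    """Count heap page writes over the lifetime of an insert-only table.

    Crude model: one write to load each page, plus one extra write per page
    if a vacuum ever freezes it. WAL, hint-bit writes and indexes are ignored.
    """
    load_writes = table_pages
    if freeze_aggressively or data_ever_needs_freezing:
        freeze_writes = table_pages  # every page is dirtied a second time
    else:
        freeze_writes = 0  # e.g. the partition is dropped before xids get old
    return load_writes + freeze_writes


if __name__ == "__main__":
    pages = 1_000_000
    lazy = heap_pages_written(pages, False, False)
    eager = heap_pages_written(pages, True, False)
    print(f"lazy freezing:       {lazy:,} page writes")
    print(f"aggressive freezing: {eager:,} page writes")
```

In this model, a table whose data never gets old enough to need freezing
sees its heap write volume double under aggressive freezing.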

I don't see how it's ok to substantially regress this very common
workload. IMO this basically means that more aggressively and
non-opportunistically freezing simply is a no-go (be it for insert or
other causes for vacuuming).

What am I missing?

Greetings,

Andres Freund
