Re: Berserk Autovacuum (let's save next Mandrill)

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-12 04:47:57
Message-ID: CAApHDvoUhTFWCGmhfwct6qcXUe1BjaN9KvMEricbEShetZhAQQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, 11 Mar 2020 at 19:00, Masahiko Sawada
<masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>
> On Wed, 11 Mar 2020 at 13:24, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> > If you have an insert-only table that has 100000 entries, it will get
> > vacuumed roughly every 20000 new entries. The impact is probably too
> > little to care, but it will increase the contention for the three
> > autovacuum workers available by default.
>
> The same is true for read-write table, right? If that becomes a
> problem, it's a mis-configuration and user should increase these
> values just like when we set these values for read-write tables.

It is true that if vacuum has more to do than it can do, then
something is not configured correctly.
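
For a table like that, the existing per-table storage parameters are
already enough to back things off; a rough sketch, where the table name
and numbers are made up for illustration:

```sql
-- Hypothetical example: raise the trigger points for one busy table so
-- it competes less often for the three default autovacuum workers.
ALTER TABLE app_log SET (
    autovacuum_vacuum_scale_factor = 0.4,  -- default is 0.2
    autovacuum_vacuum_threshold = 10000    -- default is 50
);
```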

I imagine Laurenz set the scale factor to 0.0 and the threshold to 10
million to reduce the chances that someone will encounter that
problem. I mentioned somewhere upthread that I commonly used to see
production servers running with the standard vacuum_cost_limit of 200
and the (pre-PG12) autovacuum_vacuum_cost_delay of 20ms. Generally, it
didn't go well for them. autovacuum_vacuum_cost_delay is now 2ms by
default, so auto-vacuum in PG12 and beyond runs 10x faster, but it's
still pretty conservative and it'll still need another bump in several
years when hardware is faster than it is today. So, by no means did
that 10x increase mean that nobody will suffer from auto-vacuum
starvation ever again.
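
For reference, those are the settings below; the values shown are the
current defaults, and raising vacuum_cost_limit is the usual lever when
auto-vacuum can't keep up, not a recommendation for every install:

```
# postgresql.conf -- PG12+ defaults shown
autovacuum_vacuum_cost_delay = 2ms  # was 20ms before PG12
vacuum_cost_limit = 200             # often needs raising on fast storage
```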

Now, it remains to be seen whether adding additional work onto
auto-vacuum will help or hinder those people. If their auto-vacuum
can just keep up until the cluster is old enough to need
anti-wraparound vacuums and then falls massively behind, then perhaps
this is a good thing as they might notice at some point before their
server explodes in the middle of the night. By that time they might
have become complacent. Additionally, I think this is pretty well
aligned with the case mentioned in the subject line of this email. We
now have a freeze map, so performing vacuums to freeze tuples twice as
often is not really much more expensive in total than doing that
vacuuming half as often. Even tables (e.g. log tables) that are never
queried won't become much more costly to maintain. In the meantime,
for tables that do receive queries, we're more likely to get an
index-only scan.
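
If anyone wants to see how much of a table the freeze map already
covers, the pg_visibility contrib module can show it; a sketch, with
"some_table" standing in for whichever table you care about:

```sql
-- Requires the pg_visibility contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Count pages that are all-visible (skippable by vacuum and usable by
-- index-only scans) and all-frozen (skippable by freeze vacuums).
SELECT count(*) FILTER (WHERE all_visible) AS all_visible_pages,
       count(*) FILTER (WHERE all_frozen)  AS all_frozen_pages,
       count(*)                            AS total_pages
FROM pg_visibility_map('some_table');
```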

Perhaps a good way to decide what the scale_factor should be set to
is to compare the run-time of an Index Only Scan vs an Index Scan.

create table ios (a int, b text);
insert into ios select x,x::text from generate_series(1,1000000)x;
create index on ios (a);
vacuum analyze ios;

explain (analyze, buffers) select a from ios order by a; -- on 2nd exec
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using ios_a_idx on ios  (cost=0.42..25980.42 rows=1000000 width=4) (actual time=0.035..212.602 rows=1000000 loops=1)
   Heap Fetches: 0
   Buffers: shared hit=2736
 Planning Time: 0.095 ms
 Execution Time: 246.864 ms
(5 rows)

set enable_indexonlyscan=0;
explain (analyze, buffers) select a from ios order by a;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Index Scan using ios_a_idx on ios  (cost=0.42..31388.42 rows=1000000 width=4) (actual time=0.036..451.381 rows=1000000 loops=1)
   Buffers: shared hit=8140
 Planning Time: 0.089 ms
 Execution Time: 486.582 ms
(4 rows)

So about twice as fast with the IOS. Whether it's going to be
beneficial to perform the vacuum will depend on the read-to-insert
ratio. I'm
starting to think that we should set the scale_factor to something
like 0.3 and the threshold to 50. Is anyone strongly against that? Or
Laurenz, are you really set on the 10 million threshold?
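
To put numbers on that: with a 0.3 scale factor and a threshold of 50,
an insert-only table would be vacuumed once inserts since the last
vacuum exceed 50 + 0.3 * reltuples, i.e. about 300,050 rows for the
1-million-row table above. Assuming the per-table knobs end up
mirroring the existing autovacuum reloptions (the names here are my
guess, not final), an override would look like:

```sql
-- Hypothetical per-table override; parameter names are assumed to
-- mirror the existing autovacuum reloptions.
ALTER TABLE ios SET (
    autovacuum_vacuum_insert_scale_factor = 0.3,
    autovacuum_vacuum_insert_threshold = 50
);
-- Sketch of the trigger condition:
--   inserts_since_vacuum > 50 + 0.3 * reltuples
--   = 50 + 0.3 * 1,000,000 ~= 300,050 for the table above
```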
