What specific circumstances trigger Autovacuum wraparound

From: Daniel Lagerman <spjheruur(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: What specific circumstances trigger Autovacuum wraparound
Date: 2018-06-05 02:51:34
Message-ID: CAAw8gvjNK5yBRgP6Xz=FgaVtrKs4igHAZr-Wcoy4w6K-+UzeRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello!

Please note that I have read:
https://www.postgresql.org/docs/current/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND

And I think i understand how the settings are supposed work, however I do
have a question in regards to Autovaccum wrap-around and how it is
triggered. I have a pg 9.4.3 server that has one table with calculate age
of about 320 million, this is the oldest table in this database and
therefore the age of the DB is the same as the table. This is calculated
with the SQL provided on the page above.

Before I get into the settings I have I will explain the behavior. Right
now there is 6 Autovaccum wraparound running, I have 6 workers configured
so nothing strange about that, but what puzzles me is that only 3 out of 6
tables has not recently completed an Autovaccum and only one table was over
the old limit when it started.

The settings that are set are as follows:
vacuum_freeze_min_age = 10000000
autovacuum_freeze_max_age = 600000000
vacuum_freeze_table_age = 400000000
vacuum_cost_delay = 10
vacuum_cost_limit = 2000
autovacuum_vacuum_cost_limit = 200
autovacuum_vacuum_cost_delay = 20

All other settings for vacuum/autovacuum are PG 9.4.3 default. And server
has been restarted after settings were changed.

If I read the article above and I quote "autovacuum will be invoked on it
approximately once every autovacuum_freeze_max_age minus
vacuum_freeze_min_age transactions" to me it indicates that this would be
around every 590 million transactions after I did my change. But maybe it
is to late once it has triggered? With the default values it would be
around 150 million which seems to add up more that what I see.

I also read and I quote " so it can be useful to try to maximize the
interval between forced autovacuums on very large static tables. Obviously
one can do this either by increasing autovacuum_freeze_max_age or
decreasing vacuum_freeze_min_age "

In my scenario the table is not static, but I interpret that you could
delay vacuum by changing these parameters.

What also puzzles is that even once PG completes an Autovaccum (all table
except 3 right now) the Autovacuum wraparound just loops over every table
once more. Some tables have been Autovaccumed wraparound 4-5 times since
the last restart of PG server 2 days ago. It seems to me that due to that
the database itself is over the default limit Autovacuum is running on a
loop for all tables to try and get it below that limit.

There are no normal vacuums as I can see, which means I can't run anything
like flexible freeze during off hours (middle of the night) without doing a
manual cancel query on the Autovaccum which might not be a good idea.

I did attempt to run flexible freeze for 45 hours during the weekend (with
very aggressive settings compared to Autovaccum), and it did not complete
the table which is around 1 TB large in that time. I'm worried that these
two vacuums are fighting/resetting each other and will never complete if I
keep doing that, so I have stopped that for now.

I have another server, same specs, where the same table is actually larger
in disk size, than the server i'm currently working on, where I do not see
this behavior and vacuum actually completed in a couple of days of running.

Hopefully I have not missed anything very obvious. I'm most interested in
and explanation of why PG is behaving as it is, as it's always good to
understand why something is happening even if you can't change it and also
in regards to cancel Autovacuum to run flexible freeze during off hours in
order to try and speed up the process, is it setting me back or not.

Best regards

Daniel

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jonathan S. Katz 2018-06-05 03:08:21 Re: Code of Conduct plan
Previous Message Michael Paquier 2018-06-05 01:22:02 Re: Microsoft buys GitHub, is this a threat to open-source