Re: Berserk Autovacuum (let's save next Mandrill)

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Banck <mbanck(at)gmx(dot)net>
Subject: Re: Berserk Autovacuum (let's save next Mandrill)
Date: 2020-03-10 19:17:54
Message-ID: 63dd4d51b0525574a702ffbf58a5c585f2b0ede1.camel@cybertec.at
Lists: pgsql-hackers

On Tue, 2020-03-10 at 18:14 +0900, Masahiko Sawada wrote:

Thanks for the review and your thoughts!

> FYI, vacuum can actually perform the index cleanup phase (i.e. the
> PROGRESS_VACUUM_PHASE_INDEX_CLEANUP phase) on a table even if it's a
> truly INSERT-only table, depending on
> vacuum_cleanup_index_scale_factor. Anyway, I also agree with not
> disabling index cleanup in the insert-only vacuum case, because it could
> become not only a cause of index bloat but also a big performance
> issue. For example, if autovacuum on a table always runs without index
> cleanup, a GIN index on that table will accumulate inserted tuples in
> its pending list, which will then be cleaned up by a backend process while
> inserting new tuples, not by an autovacuum process. We can disable index
> vacuum per table with the index_cleanup storage parameter, so it would be
> better to defer these settings to users.

Thanks for the confirmation.

> I have one question about this patch from an architectural perspective:
> have you considered using autovacuum_vacuum_threshold and
> autovacuum_vacuum_scale_factor for this purpose as well? That is, we
> compare the threshold computed from these values not only to the number
> of dead tuples but also to the number of inserted tuples. If the number
> of dead tuples exceeds the threshold, we trigger autovacuum as usual.
> On the other hand, if the number of inserted tuples exceeds it, we trigger
> autovacuum with vacuum_freeze_min_age = 0. I'm concerned about how users
> will understand the settings of the two newly added parameters. We will
> have 4 parameters in total. Amit was also concerned about that[1].
>
> I think this idea also works fine. In the insert-only table case, since
> only the number of inserted tuples increases, a single threshold
> (that is, the threshold computed from autovacuum_vacuum_threshold and
> autovacuum_vacuum_scale_factor) is enough to trigger autovacuum. And
> in the mostly-insert table case, we can already trigger autovacuum
> in current PostgreSQL, since there are some dead tuples.
> But if we want to trigger autovacuum more frequently based on the number of
> newly inserted tuples, we can set that threshold lower while
> considering only the number of inserted tuples.

I am torn.

On the one hand it would be wonderful not to have to add yet more GUCs
to the already complicated autovacuum configuration. It already confuses
too many users.

On the other hand, that will lead to unnecessary vacuums for small
tables. Worse, because the threshold grows with the table size through
the comparatively large scale factor, it may make us vacuum large
tables too seldom.
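
To put rough numbers on this concern, here is a minimal sketch. It assumes the
insert trigger would reuse the existing formula (threshold + scale_factor *
reltuples) with the stock defaults autovacuum_vacuum_threshold = 50 and
autovacuum_vacuum_scale_factor = 0.2. The function names are hypothetical and
are not taken from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): the number of tuples that must
 * accumulate before autovacuum triggers, computed the same way the
 * existing dead-tuple threshold is computed.
 */
double
vacuum_trigger_point(double base_threshold, double scale_factor,
                     double reltuples)
{
    return base_threshold + scale_factor * reltuples;
}

/*
 * Hypothetical check for the proposal quoted above: compare the number of
 * inserted tuples against the same trigger point used for dead tuples.
 */
bool
insert_vacuum_due(double inserted_tuples, double base_threshold,
                  double scale_factor, double reltuples)
{
    return inserted_tuples >
        vacuum_trigger_point(base_threshold, scale_factor, reltuples);
}
```

Under these assumptions, a 100-row table would be vacuumed after about 70
inserted rows, while a table with a billion rows would have to receive roughly
200 million new rows first.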

I'd be grateful if somebody knowledgeable could throw his or her opinion
into the scales.

> And I briefly looked at this patch:
>
> @@ -2889,7 +2898,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
>          tab->at_params.truncate = VACOPT_TERNARY_DEFAULT;
>          /* As of now, we don't support parallel vacuum for autovacuum */
>          tab->at_params.nworkers = -1;
> -        tab->at_params.freeze_min_age = freeze_min_age;
> +        tab->at_params.freeze_min_age = freeze_all ? 0 : freeze_min_age;
>          tab->at_params.freeze_table_age = freeze_table_age;
>          tab->at_params.multixact_freeze_min_age = multixact_freeze_min_age;
>          tab->at_params.multixact_freeze_table_age = multixact_freeze_table_age;
>
> I think we can set multixact_freeze_min_age to 0 as well.

Ugh, yes, that is a clear oversight.
I have fixed it in the latest version.
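
For illustration only, the intended effect of that fix could look roughly like
the sketch below. The later patch version is not shown in this message, so this
is an assumption about its shape: FreezeParams is a hypothetical, reduced
stand-in for the freeze-related fields of at_params, and choose_freeze_params
is an invented helper name.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the freeze-related fields of at_params. */
typedef struct FreezeParams
{
    int freeze_min_age;
    int freeze_table_age;
    int multixact_freeze_min_age;
    int multixact_freeze_table_age;
} FreezeParams;

/*
 * When freeze_all is set (the insert-triggered vacuum case), force both
 * minimum freeze ages to 0 so that transaction IDs and multixact IDs are
 * frozen alike. The table-age settings are passed through unchanged.
 */
FreezeParams
choose_freeze_params(bool freeze_all,
                     int freeze_min_age, int freeze_table_age,
                     int multixact_freeze_min_age,
                     int multixact_freeze_table_age)
{
    FreezeParams p;

    p.freeze_min_age = freeze_all ? 0 : freeze_min_age;
    p.freeze_table_age = freeze_table_age;
    p.multixact_freeze_min_age = freeze_all ? 0 : multixact_freeze_min_age;
    p.multixact_freeze_table_age = multixact_freeze_table_age;
    return p;
}
```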

Yours,
Laurenz Albe
