Re: New IndexAM API controlling index vacuum strategies

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject: Re: New IndexAM API controlling index vacuum strategies
Date: 2021-03-23 03:28:15
Message-ID: CAH2-Wz=b9bYB474sQNQZtPSYw7TDRYZPequQ0yV4uBx8s9c3Yg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 22, 2021 at 6:41 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> But we're not sure when the next anti-wraparound vacuum will take
> place. Since the table is already vacuumed by a non-aggressive vacuum
> with index cleanup disabled, an autovacuum will process the table
> when the table gets modified enough or the table's relfrozenxid gets
> older than autovacuum_freeze_max_age. If the new threshold, probably a
> new GUC, is much lower than autovacuum_freeze_max_age and
> vacuum_freeze_table_age, the table is continuously vacuumed without
> advancing relfrozenxid, leading to unnecessary index bloat. Given
> the new threshold is for emergency purposes (i.e., advancing
> relfrozenxid faster), I think it might be better to use
> vacuum_freeze_table_age as the lower bound of the new threshold. What
> do you think?

As you know, when the user sets vacuum_freeze_table_age to a value
that is greater than the value of autovacuum_freeze_max_age, the two
GUCs have values that are contradictory. This contradiction is
resolved inside vacuum_set_xid_limits(), which knows that it should
"interpret" the value of vacuum_freeze_table_age as
(autovacuum_freeze_max_age * 0.95) to paper over the user's error.
This 0.95 behavior is documented in the user docs, though it happens
silently.
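
To make the existing behavior concrete, here is a rough standalone
sketch of that clamp. It is a simplified paraphrase, not the actual
body of vacuum_set_xid_limits(); the function name
effective_freeze_table_age() and its signature are made up for
illustration, and only the 0.95 factor comes from the behavior
described above.

```c
#include <stdint.h>

/*
 * Illustrative only: effective_freeze_table_age() is a hypothetical helper,
 * not a PostgreSQL function. The parameters mirror the two GUCs by name.
 */
static int32_t
effective_freeze_table_age(int32_t vacuum_freeze_table_age,
                           int32_t autovacuum_freeze_max_age)
{
    /* Upper bound: 95% of autovacuum_freeze_max_age, truncated to an integer. */
    int32_t cap = (int32_t) (autovacuum_freeze_max_age * 0.95);

    /* Silently use the smaller of the user's setting and the cap. */
    return (vacuum_freeze_table_age < cap) ? vacuum_freeze_table_age : cap;
}
```

A setting below the cap is used as-is; a setting above it is quietly
replaced by the cap, with no error or warning.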

You seem to be concerned about a similar contradiction. In fact it's
a *very* similar contradiction, because this new GUC is naturally a
"sibling GUC" of both vacuum_freeze_table_age and
autovacuum_freeze_max_age (the "units" are the same, though the
behavior that each GUC triggers is different -- but
vacuum_freeze_table_age and autovacuum_freeze_max_age are both already
*similar and different* in the same way). So perhaps the solution
should be similar -- silently interpret the setting of the new GUC to
resolve the contradiction.

(Maybe I should say "these two new GUCs"? MultiXact variant might be needed...)
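
One possible shape for that silent interpretation, as a sketch only.
Everything here is an assumption: "emergency_age" is a placeholder
name for the new GUC, effective_emergency_age() is a hypothetical
helper, and the choice of autovacuum_freeze_max_age itself as the
lower bound is just one option (Sawada-san's suggestion of
vacuum_freeze_table_age would be another).

```c
#include <stdint.h>

/*
 * Illustrative only: neither the GUC name "emergency_age" nor this helper
 * exists in PostgreSQL. The lower bound used here is an assumption.
 */
static int32_t
effective_emergency_age(int32_t emergency_age,
                        int32_t autovacuum_freeze_max_age)
{
    /*
     * Assumed rule: the emergency threshold is never interpreted as lower
     * than autovacuum_freeze_max_age, so a contradictory setting is raised
     * silently rather than rejected.
     */
    return (emergency_age > autovacuum_freeze_max_age)
        ? emergency_age
        : autovacuum_freeze_max_age;
}
```

With a rule like this, a setting of 1.8 billion is left alone, a
setting below autovacuum_freeze_max_age is raised to it, and passing
zero for both inputs yields zero, so nothing here gets in the way of
someone deliberately lowering the settings for testing.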

This approach has the following advantages:

* It follows precedent.

* It establishes that the new GUC is a logical extension of the
existing vacuum_freeze_table_age and autovacuum_freeze_max_age GUCs.

* The default value for the new GUC will be so much higher (say 1.8
billion XIDs) than even the default of autovacuum_freeze_max_age that
it won't disrupt anybody's existing postgresql.conf setup.

* For the same reason (the big space between autovacuum_freeze_max_age
and the new GUC with default settings), you can almost set the new GUC
without needing to know about autovacuum_freeze_max_age.

* The overall behavior isn't actually restrictive/paternalistic. That
is, if you know what you're doing (say you're testing the feature) you
can reduce all 3 sibling GUCs to 0 and get the testing behavior that
you desire.

What do you think?

--
Peter Geoghegan
