25.2. Autovacuum Maintenance Tasks #

25.2.1. Recovering Disk Space
25.2.2. Freezing to manage the transaction ID space
25.2.3. Updating the Visibility Map
25.2.4. Truncating Transaction Status Information
25.2.5. Updating Planner Statistics

PostgreSQL databases require periodic maintenance known as vacuuming, and require periodic updates to the statistics used by the PostgreSQL query planner. The VACUUM and ANALYZE commands perform these maintenance tasks. The autovacuum daemon automatically schedules maintenance tasks based on workload requirements.

The autovacuum daemon has to process each table regularly for several reasons:

  1. To recover or reuse disk space occupied by updated or deleted rows.
  2. To maintain the system's ability to allocate transaction IDs through freezing.
  3. To update the visibility map, which speeds up index-only scans, and helps the next VACUUM operation avoid needlessly scanning already-frozen pages.
  4. To enable truncation of obsolescent transaction status information in structures such as pg_xact for the entire cluster.
  5. To update data statistics used by the PostgreSQL query planner.

The first four maintenance tasks are handled by running VACUUM from within an autovacuum worker process. The fifth and final task (maintenance of planner statistics) is handled by running ANALYZE from within an autovacuum worker process.

Generally speaking, database administrators new to tuning autovacuum should start by considering the need to adjust autovacuum's scheduling. Autovacuum scheduling is controlled via threshold settings. These settings determine when autovacuum should launch a worker to run VACUUM and/or ANALYZE; see the previous section, Section 25.1. This section provides additional information about the design and goals of autovacuum, VACUUM, and ANALYZE. The intended audience is database administrators that wish to perform more advanced autovacuum tuning, with any of the following goals in mind:

With larger installations, tuning autovacuum usually won't be a once-off task; it is best to approach tuning as an iterative, applied process.

Autovacuum might create a lot of I/O traffic at times, which can cause poor performance for other active sessions. There are configuration parameters you can adjust to reduce the impact on system response time. See the autovacuum-specific cost delay settings described in Section 20.10, and additional cost delay settings described in Section 20.4.4.

Database administrators might also find it useful to supplement the daemon's activities with manually-managed VACUUM commands. Scripting tools like cron and Task Manager can help with this. It can be useful to perform off-hours VACUUM commands during periods when the application experiences less demand (e.g., on weekends, or in the middle of the night). This section applies equally to manually-issued VACUUM and ANALYZE operations, except where otherwise noted.

Tip

You can monitor VACUUM progress (whether run by autovacuum or manually) via the pg_stat_progress_vacuum view. See Section 28.5.5.

25.2.1. Recovering Disk Space #

In PostgreSQL, an UPDATE or DELETE of a row does not immediately remove the old version of the row. This approach is necessary to gain the benefits of multiversion concurrency control (MVCC, see Chapter 13): the row version must not be deleted while it is still potentially visible to other transactions. A deleted row version (whether from an UPDATE or DELETE) will usually cease to be of interest to any still-running transaction shortly after the original deleting transaction commits.

The space dead tuples occupy must eventually be reclaimed for reuse by new rows, to avoid unbounded growth of disk space requirements. Reclaiming space from dead rows is VACUUM's main responsibility.

The transaction ID number (XID) based cutoff point that VACUUM uses to determine if a deleted tuple is safe to physically remove is reported under removable cutoff in the server log when autovacuum logging (controlled by log_autovacuum_min_duration) reports on a VACUUM operation executed by autovacuum. Tuples that are not yet safe to remove are counted as dead but not yet removable tuples in the log report. VACUUM establishes its removable cutoff once, at the start of the operation. Any older MVCC snapshot (or transaction that allocates an XID) that's still running when the cutoff is established may hold it back.

Caution

It's critical that no long-running transactions are allowed to hold back every VACUUM operation's cutoff for an extended period. It may be a good idea to add monitoring to alert you about this.

Note

VACUUM can remove tuples inserted by aborted transactions immediately

VACUUM usually doesn't return space to the operating system. There is one exception: space is returned to the OS whenever a group of contiguous pages appears at the end of a table. VACUUM must acquire an ACCESS EXCLUSIVE lock to perform relation truncation. You can disable relation truncation by setting the table's vacuum_truncate storage parameter to off.

Tip

If you have a table whose entire contents are deleted periodically, consider using TRUNCATE rather than DELETE. TRUNCATE removes the entire table's contents immediately, obviating the need for VACUUM. One disadvantage is that strict MVCC semantics are violated.

Tip

VACUUM FULL (or CLUSTER) can be useful when dealing with extreme amounts of dead tuples. It can reclaim more disk space, but it is much slower, and usually more disruptive. VACUUM FULL rewrites an entire new copy of the table and rebuilds all of the table's indexes. This makes it suitable for highly fragmented tables, and tables where significant amounts of space can be reclaimed.

Note

Although VACUUM FULL is technically an option of the VACUUM command, VACUUM FULL uses a completely different implementation. VACUUM FULL is essentially a variant of CLUSTER. (The name VACUUM FULL is historical; the original implementation was closer to standard VACUUM.)

Warning

TRUNCATE, VACUUM FULL, and CLUSTER all require an ACCESS EXCLUSIVE lock, which can be highly disruptive (SELECT, INSERT, UPDATE, and DELETE commands can't run at the same time).

Warning

VACUUM FULL and CLUSTER temporarily use extra disk space. The extra space required is approximately equal to the size of the table, since the old copies of the table and indexes can't be released until the new ones are complete.

25.2.2. Freezing to manage the transaction ID space #

VACUUM often marks some of the pages that it scans frozen, indicating that all eligible rows on the page were inserted by a transaction that committed sufficiently far in the past that the effects of the inserting transaction are certain to be visible to all current and future transactions. The specific transaction ID number (XID) stored in a frozen heap row's xmin field is no longer needed to determine its visibility. Furthermore, when a row undergoing freezing has an XID set in its xmax field (e.g., an XID left behind by an earlier SELECT FOR UPDATE row locker), the xmax field's XID is usually also removed.

Once frozen, heap pages are self-contained. Every query can read all of the page's rows in a way that assumes that the inserting transaction committed and is visible to its MVCC snapshot. No query will ever have to consult external transaction status metadata to interpret the page's contents, either. In particular, pg_xact transaction XID commit/abort status lookups won't occur during query execution.

Freezing is a WAL-logged operation, so when VACUUM freezes a heap page, any copy of the page located on a physical replication standby server will itself be frozen shortly thereafter (when the relevant FREEZE_PAGE WAL record is replayed on the standby). Queries that run on physical replication standbys avoid pg_xact lookups when reading from frozen pages, just like queries that run on the primary server [15].

VACUUM generally postpones some freezing work as an optimization, but VACUUM cannot delay freezing forever. Since on-disk storage of transaction IDs in heap row headers uses a truncated partial 32-bit representation to save space (rather than the full 64-bit representation used in other contexts), it plays a crucial role in enabling management of the XID address space by VACUUM. If, for whatever reason, VACUUM is unable to freeze older XIDs on behalf of an application that continues to require XID allocations, the system will eventually refuse to allocate transaction IDs due to transaction ID exhaustion (though this is unlikely to occur unless autovacuum is configured incorrectly).

vacuum_freeze_min_age controls when freezing takes place. When VACUUM scans a heap page containing even one XID that has already attained an age exceeding this value, the page is frozen.

Multixact IDs support row locking by multiple transactions. Since there is only limited space in a heap tuple header to store lock information, that information is encoded as a multiple transaction ID, or Multixact ID for short, whenever there is more than one transaction concurrently locking a row. Information about which transaction IDs are included in any particular Multixact ID is stored separately in pg_multixact. Only the Multixact ID itself (a 32-bit integer) appears in the tuple's xmax field. This creates a dependency on external Multixact ID transaction status information. This is similar to the dependency ordinary unfrozen XIDs have on commit status information from pg_xact. VACUUM must therefore occasionally remove Multixact IDs from tuples during freezing.

vacuum_multixact_freeze_min_age also controls when freezing takes place. It is analogous to vacuum_freeze_min_age, but age is expressed in Multixact ID units. Lowering vacuum_multixact_freeze_min_age forces VACUUM to process xmax fields with a Multixact ID in cases where it would otherwise postpone the work of processing xmax until the next VACUUM [16]. The setting generally doesn't significantly influence the total number of pages VACUUM freezes, even in tables containing many Multixact IDs. This is because VACUUM generally prefers proactive processing for most individual xmax fields that contain a Multixact ID (eager proactive processing is typically cheaper).

Managing the added WAL volume from freezing over time is a vital consideration for VACUUM. It is why VACUUM doesn't just freeze every eligible tuple at the earliest opportunity: the WAL written to freeze a page's tuples is wasted in cases where the resulting frozen tuples are soon deleted or updated anyway. It's also why VACUUM will freeze all eligible tuples from a heap page once the decision to freeze at least one tuple is taken: at that point, the added cost of freezing all eligible tuples eagerly (measured in extra bytes of WAL written) is far lower than the probable cost of deferring freezing until a future VACUUM operation against the same table. Furthermore, once the page is frozen, it can generally be marked as all-frozen within the visibility map immediately afterwards.

Note

In PostgreSQL versions before 16, VACUUM triggered freezing at the level of individual xmin and xmax fields. Freezing only affected the exact XIDs that had already attained an age of vacuum_freeze_min_age or greater.

VACUUM also triggers the freezing of a page in cases where it already proved necessary to write out a full page image (FPI) as part of a WAL record describing how dead tuples were removed [17] (see Section 30.1 for background information about how FPIs provide torn page protection). This freeze on an FPI write batching mechanism avoids an expected additional FPI for the same page later on (this is the probable outcome of lazily deferring freezing until vacuum_freeze_min_age forces it). In effect, VACUUM generates slightly more WAL in the short term with the aim of ultimately needing to generate much less WAL in the long term.

Tip

For tables that receive INSERT operations, but few or no UPDATE/DELETE operations, it might be beneficial to lower autovacuum_freeze_min_age for the table. This makes VACUUM freeze the table's pages eagerly during earlier autovacuums triggered by autovacuum_vacuum_insert_scale_factor, which improves performance stability for some workloads.

25.2.2.1. Aggressive VACUUM #

As noted already, freezing doesn't just allow queries to avoid lookups of subsidiary transaction status information in structures such as pg_xact. Freezing also plays a crucial role in enabling transaction ID address space management by VACUUM (and autovacuum). VACUUM maintains information about the oldest unfrozen XID that remains in the table when it uses its aggressive strategy.

Aggressive VACUUM updates the table's pg_class.relfrozenxid to whatever XID was the oldest observed XID that VACUUM didn't freeze still remaining at the end of processing. The table's relfrozenxid advances by a certain number of XIDs (relative to the previous value set during the last aggressive VACUUM) as progress on freezing the oldest pages in the table permits. Aggressive VACUUM will occasionally need to advance the whole database's pg_database.datfrozenxid afterwards, too — this is the minimum of the per-table relfrozenxid values (i.e., the earliest relfrozenxid) within the database.

Aggressive VACUUM may need to perform significant amounts of catch-up freezing missed by earlier non-aggressive VACUUMs, because non-aggressive VACUUM sometimes allows unfrozen pages to build up.

Over time, aggressive autovacuuming has two beneficial effects on the system as a whole:

  1. It keeps track of the oldest remaining unfrozen transaction ID in the entire database cluster (i.e., the oldest transaction ID across every table in every database).
  2. It avoids a cluster-wide oldest unfrozen transaction ID that is too old.

The maximum XID age that the system can tolerate (i.e., the maximum distance between the oldest unfrozen transaction ID in any table in the database cluster and the next unallocated transaction ID) is about 2.1 billion transaction IDs. This maximum XID age invariant makes it fundamentally impossible to postpone aggressive VACUUMs (and freezing) forever. While there is no simple formula for determining an oldest XID age for database administrators to target, the invariant imposes a 2.1 billion XID age hard limit — so there is a clear point at which unfrozen XIDs should always be considered too old, regardless of individual application requirements or workload characteristics. If the hard limit is reached, the system experiences transaction ID exhaustion, which temporarily prevents the allocation of new permanent transaction IDs. The system will only regain the ability to allocate new transaction IDs when VACUUM succeeds in advancing the oldest datfrozenxid in the cluster (following an aggressive VACUUM that runs to completion against the table with the oldest relfrozenxid).

Aggressive VACUUM also maintains the pg_class.relminmxid and pg_database.datminmxid fields. These are needed to track the oldest Multixact ID in the table and database, respectively. There are analogous rules, driven by analogous considerations about managing the Multixact ID space. This doesn't usually affect aggressive vacuuming requirements to a noticeable degree, but can in databases that consume more Multixact IDs than transaction IDs.

Caution

VACUUM may not always freeze tuple xmin XIDs that have reached vacuum_freeze_min_age in age. The basic eligibility criteria for freezing is the same as the criteria that determines if a deleted tuple is safe for VACUUM to remove: the XID-based removable cutoff (this is one of the details that appears in the server log's reports on autovacuum [18]).

In extreme cases, a long-running transaction can hold back every VACUUM's removable cutoff for so long that the system experiences transaction ID exhaustion. See Section 28.3 for details on how to monitor relfrozenxid and relminmxid age to avoid transaction ID/Multixact ID exhaustion.

These issues can be debugged by following autovacuum log reports from the server log over time: the log reports will include information about the age of each VACUUM's removable cutoff at the point the VACUUM ended. It may be useful to correlate the use of a cutoff with an excessively high age with application-level problems such as long-running transactions.

The 2.1 billion XIDs maximum XID age invariant must be preserved because transaction IDs stored in heap tuple headers use a truncated 32-bit representation (rather than the full 64-bit representation used in other contexts). Since all unfrozen transaction IDs from heap tuple headers must be from the same transaction ID epoch (or from a space in the 64-bit representation that spans two adjoining transaction ID epochs), there isn't any need to include a separate epoch field in each tuple header (see Section 74.1.2.1 for further details). This scheme requires much less on-disk storage space than a design that stores full 64-bit XIDs (consisting of a 32-bit epoch and a 32-bit partial XID) in heap tuple headers. On the other hand, it constrains the system's ability to allocate new XIDs in the worst case scenario where transaction ID exhaustion occurs.

There is only one major behavioral difference between aggressive VACUUM and non-aggressive VACUUM: non-aggressive VACUUM skips pages marked as all-visible using the visibility map, whereas aggressive VACUUM only skips the subset of pages that are both all-visible and all-frozen. In other words, pages that are just all-visible at the beginning of an aggressive VACUUM must be scanned, not skipped. Scanning existing all-visible pages is necessary to determine the oldest unfrozen XID that will remain in the table at the end of an aggressive VACUUM.

Note

In practice, most tables require periodic aggressive vacuuming. However, some individual non-aggressive VACUUM operations can advance the table's relfrozenxid and/or relminmxid.

This happens whenever a non-aggressive VACUUM notices that it is safe without incurring any added cost from scanning extra pages. It is most common in small, frequently modified tables.

Non-aggressive VACUUMs can sometimes overlook older XIDs from existing all-visible pages (due to their policy of always skipping all-visible pages). Over time, this can even lead to a significant build-up of unfrozen pages in one table (accumulated all-visible pages that remain unfrozen). When that happens, it is inevitable that an aggressive VACUUM will eventually need to perform catch-up freezing that clears the table's backlog of unfrozen pages.

There is also one minor behavioral difference between aggressive VACUUM and non-aggressive VACUUM: only aggressive VACUUM is required to sometimes wait for a page-level cleanup lock when a page is scanned and observed to contain transaction IDs/Multixact IDs that must be frozen. This difference exists because aggressive VACUUM is strictly required to advance relfrozenxid and/or relminmxid to sufficiently recent values [19]. The behavior can lead to occasional waits for a conflicting buffer pin to be released by another backend. These waits are imperceptible and harmless most of the time. In extreme cases there can be extended waits, which can be observed under the BufferPin wait event in the pg_stat_activity view. See Table 28.4.

Note

Catch-up freezing is not caused by any difference in how vacuum_freeze_min_age is applied by each type of VACUUM. It is an indirect result of vacuum_freeze_min_age only being applied to those pages that VACUUM scans (and cleanup locks) in the first place. Therefore, it can be difficult to tune vacuum_freeze_min_age, especially for tables that receive frequent non-aggressive VACUUMs and infrequent aggressive VACUUMs.

Tip

Autovacuum server log reports [18] show how many transaction IDs relfrozenxid advanced by (if at all), and how many Multixact IDs relminmxid advanced by (if at all).

The number of pages frozen, and the number of pages scanned (i.e., the number of pages processed because they could not skipped using the visibility map) are also shown. This can provide useful guidance when tuning freezing-related settings, particularly vacuum_freeze_table_age and vacuum_freeze_min_age.

vacuum_freeze_table_age controls when VACUUM uses its aggressive strategy. If age(relfrozenxid) exceeds vacuum_freeze_table_age at the start of VACUUM, VACUUM will employ its aggressive strategy; otherwise, its standard non-aggressive strategy is employed. Setting vacuum_freeze_table_age to 0 forces VACUUM to always use its aggressive strategy.

vacuum_multixact_freeze_table_age also controls when VACUUM uses its aggressive strategy. This is an independent Multixact ID based trigger for aggressive VACUUM, which works just like vacuum_freeze_table_age. It is applied against mxid_age(relminmxid) at the start of each VACUUM.

It doesn't matter if it was vacuum_freeze_table_age or vacuum_multixact_freeze_table_age that triggered VACUUM's decision to use its aggressive strategy. Every aggressive VACUUM will advance relfrozenxid and relminmxid by applying the same generic policy that controls which pages are frozen.

The default vacuum_freeze_table_age and vacuum_multixact_freeze_table_age settings are relatively low values. The vacuum_freeze_table_age and vacuum_freeze_min_age defaults are intended to limit the system to using only about 10% of the available transaction ID space at any one time. This leaves the system with a generous amount of slack capacity that allows XID allocations to continue in the event of unforeseen problems with autovacuum and/or the application. There might only be a negligible benefit from higher settings that aim to reduce the number of VACUUMs that use the aggressive strategy, in any case. Some applications may even benefit from tuning that makes autovacuum perform aggressive VACUUMs more often. If individual aggressive VACUUMs can perform significantly less catch-up freezing as a result, overall transaction processing throughput is likely to be more stable and predictable.

25.2.2.2. Anti-Wraparound Autovacuums #

To ensure that every table has its relfrozenxid (and relminmxid) advanced at regular intervals, even in the case of completely static tables, autovacuum runs against any table when it attains an age considered too far in the past. These are anti-wraparound autovacuums. In practice, all anti-wraparound autovacuums will use VACUUM's aggressive strategy (if they didn't, it would defeat the whole purpose of anti-wraparound autovacuuming).

autovacuum_freeze_max_age controls when the autovacuum daemon launches anti-wraparound autovacuums. If the age(relfrozenxid) of a table exceeds autovacuum_freeze_max_age when the autovacuum daemon periodically examines the database (which happens once every autovacuum_naptime seconds), then an anti-wraparound autovacuum is launched against the table.

autovacuum_multixact_freeze_max_age also controls when the autovacuum daemon launches anti-wraparound autovacuums. It is an independent Multixact ID based trigger for anti-wraparound autovacuuming. If the mxid_age(relminmxid) of a table exceeds autovacuum_multixact_freeze_max_age when the autovacuum daemon periodically examines the database [20], then an anti-wraparound autovacuum is launched against the table.

Use of VACUUM's aggressive strategy during anti-wraparound autovacuuming is certain, because the effective value of vacuum_freeze_table_age is silently limited to an effective value no greater than 95% of the current value of autovacuum_freeze_max_age. Similarly, the effective value of vacuum_multixact_freeze_table_age is silently limited to a value no greater than 95% of the current value of autovacuum_multixact_freeze_max_age.

It doesn't matter if it was autovacuum_freeze_max_age or autovacuum_multixact_freeze_max_age that triggered an anti-wraparound autovacuum. Every anti-wraparound autovacuum will be an aggressive VACUUM, and will therefore advance relfrozenxid and relminmxid by applying the same generic policy that controls which pages are frozen.

Anti-wraparound autovacuums are intended for static (and mostly static) tables. There is no reason to expect that a table receiving continual row inserts and/or row modifications will ever require an anti-wraparound autovacuum. As a rule of thumb, autovacuum_freeze_max_age should be set to a value somewhat above vacuum_freeze_table_age, so that there is a long window during which any autovacuum triggered by inserts, updates, or deletes (or any manually issued VACUUM) will become an aggressive VACUUM. This has the advantage of allowing aggressive vacuuming to take place at a time when vacuuming was required anyway. Each aggressive VACUUM can therefore be expected to perform just as much useful work on recovering disk space as an equivalent non-aggressive VACUUM would have (had the non-aggressive strategy been chosen instead).

Aggressive/anti-wraparound differences

Aggressive VACUUM is a special type of VACUUM. It must advance relfrozenxid up to a value that was no greater than vacuum_freeze_min_age in age as of the start of the VACUUM operation.

Anti-wraparound autovacuum is a special type of autovacuum. Its purpose is to ensure that relfrozenxid advances when no earlier VACUUM could advance it in passing — often because no VACUUM has run against the table for an extended period.

There is only one runtime behavioral difference between anti-wraparound autovacuums and other autovacuums that run aggressive VACUUMs: anti-wraparound autovacuums cannot be autocancelled. This means that autovacuum workers that perform anti-wraparound autovacuuming do not yield to conflicting relation-level lock requests (e.g., from ALTER TABLE). See Section 25.1.3 for a full explanation.

In practice, anti-wraparound autovacuum is very likely to be the type of autovacuum that updates the oldest relfrozenxid in each database to a more recent value due to the presence of completely static tables [21]. As discussed in Section 25.2.2.1, datfrozenxid only advances when the oldest relfrozenxid in the database advances (relminmxid likewise only advances when the earliest datminmxid in the database advances). This implies that anti-wraparound autovacuum is also very likely to be involved when any database's datfrozenxid/datminmxid advances (and when the cluster-wide earliest unfrozen transaction ID/Multixact ID is advanced to a more recent value, in turn).

It follows that autovacuum_freeze_max_age is usually the limiting factor for advancing the cluster-wide oldest unfrozen transaction ID found in pg_control (the cluster-wide oldest unfrozen Multixact ID might occasionally be influenced by autovacuum_multixact_freeze_max_age, too). This usually isn't much of a concern in itself, since it generally doesn't predict anything about how far behind autovacuum is with freezing physical heap pages. Note, however, that this effect can significantly impact the amount of space required to store transaction status information. The oldest transaction status information (which is stored in external structures such as pg_xact) cannot safely be truncated until VACUUM can ascertain that there are no references to the oldest entries remaining in any table, from any database. See Section 25.2.4 for further details.

25.2.2.3. Transaction ID exhaustion #

If for some reason autovacuum fails to advance any table's relfrozenxid for an extended period (during which transaction IDs continue to be allocated), the system will begin to emit warning messages once the database's oldest XIDs attain an age within forty million transactions of the 2.1 billion XID hard limit described in Section 25.2.2.1. For example:

WARNING:  database "mydb" must be vacuumed within 39985967 transactions
HINT:  To avoid a database shutdown, execute a database-wide VACUUM in that database.

(A manual VACUUM should fix the problem, as suggested by the hint; but note that the VACUUM must be performed by a superuser, else it will fail to process system catalogs and thus not be able to advance the database's datfrozenxid.) If these warnings are ignored, the system will eventually refuse to allocate new transaction IDs. This happens at the point that there are fewer than three million transactions left:

ERROR:  database is not accepting commands to avoid wraparound data loss in database "mydb"
HINT:  Stop the postmaster and vacuum that database in single-user mode.

The three-million-transaction safety margin exists to let the administrator recover without data loss, by manually executing the required VACUUM commands. However, since the system will not execute commands once it has gone into the safety shutdown mode, the only way to do this is to stop the server and start the server in single-user mode to execute VACUUM. The shutdown mode is not enforced in single-user mode. See the postgres reference page for details about using single-user mode.

A similar safety mechanism is used to prevent Multixact ID allocations when any table's relminmxid is dangerously far in the past: Multixact ID exhaustion. If the system isn't experiencing transaction ID exhaustion, Multixact ID exhaustion can be fixed non-invasively by running a manual VACUUM without entering single-user mode (otherwise follow the procedure for transaction ID exhaustion). See Section 28.3 for details on how to determine which table's relminmxid is dangerously far in the past.

Note

Autovacuum has two different mechanisms that are designed to avoid transaction ID exhaustion. The first mechansim is anti-wraparound autovacuuming. There is an second, independent mechanism, used when relfrozenxid and/or relminmxid have already consumed a significant fraction of the total available transaction ID space: the failsafe.

The failsafe is triggered by VACUUM when the table's relfrozenxid attains an age of vacuum_failsafe_age XIDs, or when the table's relminmxid attains an age of vacuum_multixact_failsafe_age Multixact IDs. This happens dynamically, when the risk of eventual transaction ID (or Multixact ID) exhaustion is deemed to outweigh the risks of not proceeding as planned with ordinary vacuuming.

Once the failsafe triggers, VACUUM prioritizes advancing relfrozenxid and/or relminmxid to avoid transaction ID exhaustion. Most notably, VACUUM bypasses any remaining non-essential maintenance, such as index vacuuming.

25.2.3. Updating the Visibility Map #

Vacuum maintains a visibility map for each table to keep track of which pages contain only tuples that are known to be visible to all active transactions (and all future transactions, until the page is again modified). A separate bit tracks whether all of the tuples are frozen.

The visibility map serves two purposes.

First, VACUUM itself can skip such pages on the next run, since there is nothing to clean up. Even aggressive VACUUMs can skip pages that are both all-visible and all-frozen.

Second, it allows PostgreSQL to answer some queries using only the index, without reference to the underlying table. Since PostgreSQL indexes don't contain tuple visibility information, a normal index scan fetches the heap tuple for each matching index entry, to check whether it should be seen by the current transaction. An index-only scan, on the other hand, checks the visibility map first. If it's known that all tuples on the page are visible, the heap fetch can be skipped. This is most useful on large data sets where the visibility map can prevent disk accesses. The visibility map is vastly smaller than the heap, so it can easily be cached even when the heap is very large.

25.2.4. Truncating Transaction Status Information #

As discussed in Section 25.2.2.1, aggressive autovacuuming plays a critical role in maintaining the XID address space for the system as a whole. A secondary goal of this whole process is to enable eventual truncation of the oldest transaction status information in the database cluster as a whole. This status information is stored in dedicated simple least-recently-used (SLRU) caches backed by external storage (see Section 73.1). Truncation is only possible when VACUUM can ascertain that there are no references to the oldest entries remaining in any table, from any database, by taking the earliest datfrozenxid and datminmxid among all databases in the cluster. This isn't a maintenance task that affects individual tables; it's a maintenance task that affects the whole cluster.

The space required to store transaction status information is likely to be a low priority for most database administrators. It may occasionally be useful to limit the maximum storage overhead used for transaction status information by making anti-wraparound autovacuums happen more frequently. The frequency of system-wide anti-wraparound autovacuuming increases when autovacuum_freeze_max_age and autovacuum_multixact_freeze_max_age are decreased in postgresql.conf. This approach is effective (at limiting the storage required for transaction status information) because the oldest datfrozenxid and datminmxid in the cluster are very likely to depend on the frequency of anti-wraparound autovacuuming of completely static tables. See Section 25.2.2.2 for further discussion of the role of anti-wraparound autovacuuming in advancing the cluster-wide oldest unfrozen transaction ID.

There are two SLRU storage areas associated with transaction IDs. First, there is pg_xact, which stores commit/abort status information. Second, there is pg_commit_ts, which stores transaction commit timestamps (when track_commit_timestamp is set to on). The default autovacuum_freeze_max_age setting of 200 million transactions translates to about 50MB of pg_xact storage, and about 2GB of pg_commit_ts storage when track_commit_timestamp is enabled (it is set to off by default, which totally avoids the need to store anything in pg_commit_ts).

There are also two SLRU storage areas associated with Multixact IDs: pg_multixact/members, and pg_multixact/offsets. These are logically one storage area, implemented as two distinct storage areas. There is no simple formula to determine the storage overhead per Multixact ID, since Multixact IDs have a variable number of member transaction IDs (this is what necessitates using two different physical storage areas). Note, however, that if pg_multixact/members exceeds 2GB, the effective value of autovacuum_multixact_freeze_max_age used by autovacuum (and VACUUM) will be lower. This results in more frequent anti-wraparound autovacuums (since that's the only approach that reliably limits the size of these storage areas). It might also increase the frequency of aggressive VACUUMs more generally.

25.2.5. Updating Planner Statistics #

The PostgreSQL query planner relies on statistical information about the contents of tables in order to generate good plans for queries. These statistics are gathered by the ANALYZE command, which can be invoked by itself or as an optional step in VACUUM. It is important to have reasonably accurate statistics, otherwise poor choices of plans might degrade database performance.

The autovacuum daemon, if enabled, will automatically issue ANALYZE commands whenever the content of a table has changed sufficiently. However, administrators might prefer to rely on manually-scheduled ANALYZE operations, particularly if it is known that update activity on a table will not affect the statistics of interesting columns. The daemon schedules ANALYZE strictly as a function of the number of rows inserted or updated; it has no knowledge of whether that will lead to meaningful statistical changes.

Tuples changed in partitions and inheritance children do not trigger analyze on the parent table. If the parent table is empty or rarely changed, it may never be processed by autovacuum, and the statistics for the inheritance tree as a whole won't be collected. It is necessary to run ANALYZE on the parent table manually in order to keep the statistics up to date.

As with vacuuming for space recovery, frequent updates of statistics are more useful for heavily-updated tables than for seldom-updated ones. But even for a heavily-updated table, there might be no need for statistics updates if the statistical distribution of the data is not changing much. A simple rule of thumb is to think about how much the minimum and maximum values of the columns in the table change. For example, a timestamp column that contains the time of row update will have a constantly-increasing maximum value as rows are added and updated; such a column will probably need more frequent statistics updates than, say, a column containing URLs for pages accessed on a website. The URL column might receive changes just as often, but the statistical distribution of its values probably changes relatively slowly.

It is possible to run ANALYZE on specific tables and even just specific columns of a table, so the flexibility exists to update some statistics more frequently than others if your application requires it. In practice, however, it is usually best to just analyze the entire database, because it is a fast operation. ANALYZE uses a statistically random sampling of the rows of a table rather than reading every single row.

Tip

Although per-column tweaking of ANALYZE frequency might not be very productive, you might find it worthwhile to do per-column adjustment of the level of detail of the statistics collected by ANALYZE. Columns that are heavily used in WHERE clauses and have highly irregular data distributions might require a finer-grain data histogram than other columns. See ALTER TABLE SET STATISTICS, or change the database-wide default using the default_statistics_target configuration parameter.

Also, by default there is limited information available about the selectivity of functions. However, if you create a statistics object or an expression index that uses a function call, useful statistics will be gathered about the function, which can greatly improve query plans that use the expression index.

Tip

The autovacuum daemon does not issue ANALYZE commands for foreign tables, since it has no means of determining how often that might be useful. If your queries require statistics on foreign tables for proper planning, it's a good idea to run manually-managed ANALYZE commands on those tables on a suitable schedule.

Tip

The autovacuum daemon does not issue ANALYZE commands for partitioned tables. Inheritance parents will only be analyzed if the parent itself is changed - changes to child tables do not trigger autoanalyze on the parent table. If your queries require statistics on parent tables for proper planning, it is necessary to periodically run a manual ANALYZE on those tables to keep the statistics up to date.



[15] In this regard, freezing is unlike setting transaction status hint bits in tuple headers: setting hint bits doesn't usually need to be WAL-logged, and can take place on physical replication standby servers without input from the primary server. Hint bits exist to allow query execution to avoid repeated pg_xact lookups for the same tuples, strictly as an optimization. On the other hand, freezing exists because the system needs to reliably remove pg_xact dependencies from individual tuples.

[16] Freezing of xmax fields (whether they contain an XID or a Multixact ID) generally means clearing xmax from a tuple header. VACUUM may occasionally encounter an individual Multixact ID that must be removed to advance the table's relminmxid by the required amount, which can only be processed by generating a replacement Multixact ID (containing just the non-removable subset of member XIDs from the original Multixact ID), and then setting xmax to the new/replacement Multixact ID value.

[17] Actually, the freeze on an FPI write mechanism isn't just used when VACUUM needs to generate an FPI (as torn page protection) for inclusion in a WAL record describing how dead tuples were removed. The FPI mechanism also triggers when hint bits are set by VACUUM, if and only if it necessitates writing an FPI. The need to write a WAL record to set hint bits only arises when wal_log_hints is enabled in postgresql.conf, or when data checksums were enabled when the cluster was initialized with initdb.

[18] Autovacuum's log reports appear in the server log for autovacuums whose VACUUM takes longer than a threshold controlled by log_autovacuum_min_duration. Manual VACUUMs output the same details as INFO messages when the VACUUM command's VERBOSE option is used (note that manual VACUUMs never generate reports in the server log).

[19] Aggressive VACUUM is (somewhat arbitrarily) required to freeze all pages containing transaction IDs older than vacuum_freeze_min_age and/or Multixact ID values older than vacuum_multixact_freeze_min_age, at a minimum.

[20] Autovacuum may use a lower effective Multixact ID age than the autovacuum_multixact_freeze_max_age setting in postgresql.conf, though. Applying a lower effective value like this avoids allowing the pg_multixact/members SLRU storage area to continue to grow in size for long, once its size reaches 2GB. See Section 25.2.4.

[21] Anti-wraparound autovacuum is all but guaranteed to advance the oldest relfrozenxid/relminmxid (and therefore to advance datfrozenxid/datminmxid) in practice because in practice there is all but guaranteed to be at least one totally static table that never gets an aggressive VACUUM for any other reason (often just a tiny, completely static system catalog table).