Overhauling "Routine Vacuuming" docs, particularly its handling of freezing

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing
Date: 2023-04-24 21:57:57
Message-ID: CAH2-Wzm_vCegKSwUOG2H7368=E8yuF4+mAxaK4RDj=+2_Puzmg@mail.gmail.com
Lists: pgsql-hackers

My work on page-level freezing for PostgreSQL 16 has some remaining
loose ends to tie up with the documentation. The "Routine Vacuuming"
section of the docs has no mention of page-level freezing. It also
doesn't mention the FPI optimization added by commit 1de58df4. This
isn't a small thing to leave out; I fully expect that the FPI
optimization will very significantly alter when and how VACUUM
freezes. The cadence will look quite a lot different.

It seemed almost impossible to fit a discussion of page-level
freezing into the existing structure. In part this is because the
existing documentation emphasizes the worst case scenario, rather than
talking about freezing as a maintenance task that affects physical
heap pages in roughly the same way as pruning does. There isn't a
clean separation of things that would allow me to just add a paragraph
about the FPI thing.

Obviously it's important that the system never enters xidStopLimit
mode -- not being able to allocate new XIDs is a huge problem. But it
seems unhelpful to define that as the only goal of freezing, or even
the main goal. To me this seems similar to defining the goal of
cleaning up bloat as avoiding completely running out of disk space;
while it may be "the single most important thing" in some general
sense, it isn't all that important in most individual cases. There are
many very bad things that will happen before that extreme worst case
is hit, which are far more likely to be the real source of pain.

There are also very big structural problems with "Routine Vacuuming"
that I propose to do something about. Honestly, it's a huge mess
at this point. It's nobody's fault in particular; there has been
accretion after accretion added, over many years. It is time to
finally bite the bullet and do some serious restructuring. I'm hoping
that I don't get too much push back on this, because it's already very
difficult work.

Attached patch series shows what I consider to be a much better
overall structure. To make this convenient to take a quick look at, I
also attach a prebuilt version of routine-vacuuming.html (not the only
page that I've changed, but the most important set of changes by far).

This initial version is still quite lacking in overall polish, but I
believe that it gets the general structure right. That's what I'd like
to get feedback on right now: can we agree on the
general nature of the problem? Does this high level direction seem
like the right one?

The following list is a summary of the major changes that I propose:

1. Restructure the order of items to match the actual processing
order within VACUUM (and ANALYZE), rather than jumping from VACUUM to
ANALYZE and then back to VACUUM.

This flows a lot better, which helps with later items that deal with
freezing/wraparound.

2. Rename "Preventing Transaction ID Wraparound Failures" to
"Freezing to manage the transaction ID space". Now we talk about
wraparound as a subtopic of freezing, not vice-versa. (This is a
complete rewrite, as described by later items in this list.)

3. All of the stuff about modulo-2^32 arithmetic is moved to the
storage chapter, where we describe the heap tuple header format.

It seems crazy to me that the second sentence in our discussion of
wraparound/freezing is still:

"But since transaction IDs have limited size (32 bits) a cluster that
runs for a long time (more than 4 billion transactions) would suffer
transaction ID wraparound: the XID counter wraps around to zero, and
all of a sudden transactions that were in the past appear to be in the
future"

Here we start the whole discussion of wraparound (a particularly
delicate topic) by describing how VACUUM used to work 20 years ago,
before the invention of freezing. That was the last time that a
PostgreSQL cluster could run for 4 billion XIDs without freezing. The
invariant is that we activate xidStopLimit mode protections to avoid a
"distance" between any two unfrozen XIDs that exceeds about 2 billion
XIDs. So why on earth are we talking about 4 billion XIDs? This is the
most confusing, least useful way of describing freezing that I can
think of.
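
To make the "about 2 billion" figure concrete, here is a minimal sketch of
circular XID comparison. It is an illustration only, not PostgreSQL source
code: the function and constant names are hypothetical, and it ignores the
special permanent XIDs. It assumes the usual scheme of taking the difference
modulo 2^32 and reading it as a signed 32-bit value.

```python
XID_SPACE = 2**32    # XIDs are 32-bit counters
HALF_SPACE = 2**31   # the "about 2 billion" comparison horizon


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    The comparison is circular: take the difference modulo 2^32 and
    interpret it as a signed 32-bit integer.
    """
    diff = (a - b) % XID_SPACE
    if diff >= HALF_SPACE:
        diff -= XID_SPACE  # reinterpret as a negative signed value
    return diff < 0


old_xid = 1_000

# Still ordered correctly while fewer than 2^31 XIDs separate the two values.
print(xid_precedes(old_xid, old_xid + HALF_SPACE - 1))

# Past that distance, the old XID would appear to be in the future.
print(xid_precedes(old_xid, (old_xid + HALF_SPACE + 1) % XID_SPACE))
```

The comparison stops giving the intended answer once two unfrozen XIDs are
more than about 2^31 apart, which is why the distance that matters is roughly
2 billion XIDs rather than 4 billion.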

4. No more separate section for MultiXactID freezing -- that's
discussed as part of the discussion of page-level freezing.

Page-level freezing takes place without regard to the trigger
condition for freezing. So the new approach to freezing has a fixed
idea of what it means to freeze a given page (what physical
modifications it entails). This means that having a separate sect3
subsection for MultiXactIds now makes no sense (if it ever did).

5. The top-level list of maintenance tasks has a new addition: "To
truncate obsolescent transaction status information, when possible".

It makes a lot of sense to talk about this as something that happens
last (or last among those steps that take place during VACUUM). It's
far less important than avoiding xidStopLimit outages, obviously
(using some extra disk space is almost certainly the least of your
worries when you're near to xidStopLimit). The current documentation
seems to take precisely the opposite view, when it says the following:

"The sole disadvantage of increasing autovacuum_freeze_max_age (and
vacuum_freeze_table_age along with it) is that the pg_xact and
pg_commit_ts subdirectories of the database cluster will take more
space"

This sentence is dangerously bad advice. It is precisely backwards. At
the same time, we'd better say something about the need to truncate
pg_xact/clog here. Besides all this, the new section for this is a far
more accurate reflection of what's really going on: most individual
VACUUMs (even most aggressive VACUUMs) won't ever truncate
pg_xact/clog (or the other relevant SLRUs). Truncation only happens
after a VACUUM that advances the relfrozenxid of the table which
previously had the oldest relfrozenxid among all tables in the entire
cluster -- so we need to talk about it as an issue with the high
watermark storage for pg_xact.
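
The truncation condition described above can be sketched as a toy model.
This is a simplified illustration, not how the server implements it: the
table names, ages, and helper function are all hypothetical, and the model
tracks a single "age of relfrozenxid" per table instead of real per-database
and cluster-wide bookkeeping.

```python
def truncation_possible(ages: dict[str, int], vacuumed: str, new_age: int) -> bool:
    """Return True if vacuuming one table lowers the cluster-wide oldest age.

    ages maps table name -> age of its relfrozenxid (larger means older).
    Only a VACUUM that advances the table holding back the oldest value
    can make older transaction status information removable.
    """
    oldest_before = max(ages.values())
    after = dict(ages)
    after[vacuumed] = new_age
    oldest_after = max(after.values())
    return oldest_after < oldest_before


# Hypothetical cluster: "orders" holds back the oldest relfrozenxid.
ages = {"orders": 150_000_000, "events": 90_000_000, "users": 10_000_000}

# Advancing a table that is not the oldest changes nothing cluster-wide.
print(truncation_possible(ages, "events", 1_000))

# Advancing the oldest table moves the cluster-wide minimum forward.
print(truncation_possible(ages, "orders", 50_000_000))
```

In this model most VACUUMs, including ones that advance relfrozenxid a long
way, leave the cluster-wide oldest value unchanged, which matches the point
that truncation is the exception and not the rule.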

6. Rename the whole "Routine Vacuuming" section to "Autovacuum
Maintenance Tasks".

This is what we should be emphasizing over manually run VACUUMs.
Besides, the current title just seems wrong -- we're talking about
ANALYZE just as much as VACUUM.

Thoughts?

--
Peter Geoghegan

Attachment Content-Type Size
routine-vacuuming.html text/html 42.7 KB
v1-0009-Overhaul-Recovering-Disk-Space-vacuuming-docs.patch application/octet-stream 10.7 KB
v1-0008-Overhaul-freezing-and-wraparound-docs.patch application/octet-stream 47.5 KB
v1-0006-Merge-basic-vacuuming-sect2-into-sect1-introducti.patch application/octet-stream 6.2 KB
v1-0007-Make-maintenance.sgml-more-autovacuum-orientated.patch application/octet-stream 7.4 KB
v1-0003-Normalize-maintenance.sgml-indentation.patch application/octet-stream 4.7 KB
v1-0004-Reorder-routine-vacuuming-sections.patch application/octet-stream 16.4 KB
v1-0002-Restructure-autuovacuum-daemon-section.patch application/octet-stream 5.3 KB
v1-0005-Move-Interpreting-XID-stamps-from-tuple-headers.patch application/octet-stream 9.3 KB
v1-0001-Make-autovacuum-docs-into-a-sect1-of-its-own.patch application/octet-stream 17.3 KB
