Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Ryan Booz <ryan(at)softwareandbooz(dot)com>
Cc: samay sharma <smilingsamay(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing
Date: 2023-05-13 02:40:26
Message-ID: CAH2-Wz=tU-4T=jDbwPQZdjLpORbA6+OEmns6qQPu2i7dDhyhwg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 12, 2023 at 10:36 AM Ryan Booz <ryan(at)softwareandbooz(dot)com> wrote:
> Just to say on the outset, as has been said earlier in the tread by others,
> that this is herculean work. Thank you for putting the effort you have thus far.

Thanks!

> > It would be nice if it was possible to add an animation/diagram a
> > little like this one: https://tuple-freezing-demo.angusd.com (this is
> > how I tend to think about the "transaction ID space".)
>
> Indeed. With volunteer docs, illustrations/diagrams are hard for sure. But,
> this or something akin to the "clock" image I've seen elsewhere when
> describing the transaction ID space would probably be helpful if it were ever
> possible. In fact, there's just a lot about the MVCC stuff in general that
> would benefit from diagrams. But alas, I guess that's why we have some
> good go-to community talks/slide decks. :-)

A picture is worth a thousand words. This particular image may be
worth even more, though.

It happens to be *exactly* what I'd have done if I was tasked with
coming up with an animation that conveys the central ideas. Obviously
I brought this image up because I think that it would be great if we
could find a way to do something like that directly (not impossible,
there are a few images already). However, there is a less obvious
reason why I brought it to your attention: it's a very intuitive way
of understanding what I actually intend to convey through words -- at
least as far as talk about the cluster-wide XID space is concerned. It
might better equip you to review the patch series.

Sure, the animation will make the general idea clearer to just about
anybody -- that's a big part of what I like about it. But it also
captures the nuance that might matter to experts (e.g., the oldest XID
moves forward in jerky discrete jumps, while the next/unallocated XID
moves forward in a smooth, continuous fashion). So it works on
multiple levels, for multiple audiences/experience levels, without any
conflicts -- which is no small thing.

Do my words make you think of something a little like the animation?
If so, good.

> Thanks again for doing this. Really helpful for doc newbies like me that
> want to help but are still working through the process. Really helpful
> and appreciated.

I think that this is the kind of thing that particularly benefits from
diversity in perspectives.

> Agree. This flows fairly well and helps the user understand that each
> "next step"
> in the vacuum/freezing process has a distinct job based on previous work.

I'm trying to make it possible to read in short bursts, and to skim.
The easiest wins in this area will come from simply having more
individual sections/headings, and a more consistent structure. The
really difficult part is coming up with prose that can sort of work
for all audiences at the same time -- without alienating anybody.

Here is an example of what I mean:

The general idea of freezing can reasonably be summarized as "a
process that VACUUM uses to make pages self-contained (no need to do
pg_xact lookups anymore), that also has a role in avoiding transaction
ID exhaustion". That is a totally reasonable beginner-level (well,
relative-beginner-level) understanding of freezing. It *isn't* dumbed
down. You, as a beginner, have a truly useful take-away. At the same
time, you have avoided learning anything that you'll need to unlearn
some day. If I can succeed in doing that, I'll feel a real sense of
accomplishment.

> > * Much improved "Truncating Transaction Status Information" subsection.
> >
> > My explanation of the ways in which autovacuum_freeze_max_age can
> > affect the storage overhead of commit/abort status in pg_xact is much
> > clearer than it was in v3 -- pg_xact truncation is now treated as
> > something loosely related to the global config of anti-wraparound
> > autovacuum, which makes most sense.
> >
> This one isn't totally sinking in with me yet. Need another read.

"Truncating Transaction Status Information" is explicitly supposed to
matter much less than the rest of the stuff on freezing. The main
benefit that the DBA can expect from understanding this content is how
to save a few GB of disk space for pg_xact, which isn't particularly
likely to be possible, and is very very unlikely to be of any real
consequence, compared to everything else. If you were reading the
revised "Routine Vacuuming" as the average DBA, what you'd probably
have ended up doing is just not reading this part at all. And that
would probably be the ideal outcome. It's roughly the opposite of what
you'll get right now, by the way (bizarrely, the current docs place a
great deal of emphasis on this).

(Of course I welcome your feedback here too. Just giving you the context.)

> > It took a great deal of effort to find a structure that covered
> > everything, and that highlighted all of the important relationships
> > without going too far, while at the same time not being a huge mess.
> > That's what I feel I've arrived at with v4.
>
> In most respects I agree with the overall flow of changes w.r.t the current doc.
> Focusing on all of this as something that should normally just be happening
> as part of autovacuum is helpful. Working through it as an order of operations
> (and I'm just assuming this is the general order) feels like it ties
> things together
> a lot more. I honestly come away from this document with more of a "I understand
> the process" feel than I did previously.

That's great news. It might be helpful to give you more context about
the particular approach I've taken here, and how it falls short of
what I'd ideally like to do, in my own mind.

There are some rather strange things that happen to be true about
VACUUM and freezing today, that definitely influenced the way I
structured the docs. I can imagine an improved version of VACUUM that
is not so different to the real VACUUM that we have today (one that
still has freezing as we know it), that still has a much simpler UI --
some policy-based process for deciding which pages to freeze that was
much smarter than a simple trigger. If we were living in a world where
VACUUM actually worked like that, then I'd have been able to come up
with a structure that is a lot closer to what you might have been
hoping for from this patch series. At the very least, I'd have been
able to add some "TL;DR" text at the start of each section, that just
gave the main practical takeaway.

Take vacuum_freeze_min_age. It's a *really* bad design, even on its
own terms, even if we assume that nothing can change about how
freezing works. Yet it's probably still the most important
freezing-related GUC, even in Postgres 16. History matters here. The
GUC was invented in a world before the visibility map existed. When
the visibility map was invented, aggressive VACUUM was also invented
(before then the name for what we now call "aggressive VACUUM" was
actually just "VACUUM"). This development utterly changed the way that
vacuum_freeze_min_age actually works, but we still talk about it as if
its idea of "age" can be considered in isolation, as a universal
catch-all that can be tuned iteratively. The reality is that it is
interpreted in a way that is *hopelessly* tied to other things.

This isn't a minor point. There are really bizarre implications, with
real practical consequences. For example, suppose you want to make
autovacuums run more often against a simple append-only table -- so
you lower autovacuum_vacuum_insert_scale_factor with that in mind.
It's entirely possible that you'll now do *less* useful work, even
though you specifically set out to vacuum more aggressively! This is
due to the way the GUCs interact with each other, of course: the more
often VACUUM runs, the less likely it is that it'll find XIDs before
vacuum_freeze_min_age to trigger freezing during any individual VACUUM
operation, the less useful work you'll do (you'll just accumulate
unfrozen all-visible pages until you finally have an aggressive
VACUUM).

This is exactly as illogical as it sounds. Postgres 16 will be the
first version that even shows instrumentation around freezing at all
in the log reports from autovacuum. This will be a real eye-opener, I
suspect -- I predict that people will be surprised at how freezing
works with their workload, when they finally have the opportunity to
see it for themselves.

> Owing to the lingering belief that many users have whereby hosting providers
> have magically enabled Postgres to do all of this for you, there is
> still a need to
> actively deal with these thresholds based on load. That is, as far as
> I understand,
> Postgres doesn't automatically adjust based on load. Someone/thing
> still has to modify
> the thresholds as load and data size changes.

Well, vacuum_freeze_min_age (anything based on XID age) runs into the
following problem: what is the relationship between XID age, and
freezing work? This is a question whose answer is much too
complicated, suggesting that it's just the wrong question. There is
absolutely no reason to expect a linear relationship (or anything like
it) between XIDs consumed and WAL required to freeze rows from those
XIDs. It's a totally chaotic thing.

The reason for this is: of course it is, why wouldn't it be? On Monday
you'll do a bulk load, and 1 XID will write 1TB to one table. On
Tuesday, there might be only one row per XID consumed, with millions
and millions of rows inserted. This is 100% common sense, and yet is
kinda at odds with the whole idea of basing the decision to freeze on
age (as if vacuum_freeze_min_age didn't have enough problems
already!).

For now, I think our best bet is to signal the importance of avoiding
disaster to intermediate users, and signal the importance of iterative
tuning to advanced users.

> If the "workload requirements" is pointing towards aggressive
> freezing/wraparound
> tasks that happen regardless of thresholds, then for me at least that
> isn't clear
> in that sentence and it feels like there's an implication that
> Postgres/autovacuum
> is going to magically adjust overall vacuum work based on database workload.

That's a good point.

> 2) "The intended audience is database administrators that wish to
> perform more advanced
> autovacuum tuning, with any of the following goals in mind:"
>
> I love calling out the audience in some way. That's really helpful, as are the
> stated goals in the bullet list. However, as someone feeling pretty novice
> after reading all of this, I can't honestly connect how the content on this page
> helps me to more advanced tuning.

You're right to point that out; the actual content here was written
half-heartedly, in part because it depends on the dead-tuple-space
patch, which is not my focus at all right now.

Here is what I'd like the message to be, roughly:

1. This isn't something that you read once. You read it in small
bites. You come back to it from time to time (or you will if you need
to).

At one point Samay said: "I'll give my perspective as someone who has
not read the vacuum code but have learnt most of what I know about
autovacuum / vacuuming by reading the "Routine Vacuuming" page 10s of
times". I fully expect that a minority of users will want to do the
same with these revised docs. The content is very much not supposed to
be read through in one sitting (not if you expect to get any value out
of it). It is very calorie dense, and I don't think that that's really
a problem to be solved.

You have a much better chance of getting value out of it if you as a
user refer back to it as problems emerge. Some things may only click
after the second or third read, based on the experience of trying to
put something else into action in production.

2. If you don't like that it's calorie dense, then that's probably
okay -- just don't read past the parts that seem useful.

3. There are one or two exceptions (e.g., the "Tip" about freezing for
append-only tables), but overall there isn't going to be a simple
formula to follow -- the closest thing might be "don't bother doing
anything until it proves necessary".

This is because too much depends on individual workload requirements.
It is also partly due to it just being really hard to tune things like
vacuum_freeze_min_age very well right now.

4. It's an applied process. The emphasis should be on solving
practical, observed problems that are directly observed -- this isn't
a cookbook (though there are a couple of straightforward recipes,
covering one or two specific things).

> ** Section 2.5.1 - Recovering Disk Space **

It should be noted that what I've done in this area is quite
incomplete. I have only really done structural things here, and some
of these may not be much good.

> It feels like one connection you could make to the bullet list above
> is in this area
> and not mentioned. By freeing up space and reducing the number of pages that
> need to be read for satisfying a query, vacuum and recovering disk space
> (theoretically) improves query performance. Not 100% how to add it in context
> of these first two paragraphs.

It's hard, because it's not so much that vacuuming improves query
performance. It's more like *not* vacuuming hurts it. The exact point
that it starts to hurt is rather hard to predict -- and there might be
little point in trying to predict it with precision.

I tend to think that I'd probably be better off saying nothing about
query response times. Or saying something negative (what to definitely
avoid), not something positive (what to do) -- I would expect it to
generalize a lot better that way.

> 4) Caution: "It may be a good idea to add monitoring to alert you about this."
> I hate to be pedantic about it, but I think we should spell out
> "this". Do we have
> a pointer in documentation to what kinds of things to monitor for? Am monitoring
> long-running transactions or some metric that shows me that VACUUM is being
> "held back"? I know what you mean, but it's not clear to me how to do the right
> thing in my environment here.

Will do.

> 5) The plethora of tips/notes/warnings.
> As you and others have mentioned, as presented these really have no context
> for me. Individually they are good/helpful information, but it's
> really hard to make
> a connection to what I should "do" about it.

Yeah, I call that out in the relevant commit message of the patch as
bad, as temporary.

> It seems to me that this would be a good place to put a subsection which is
> something like, "A note about reclaiming disk space" or something. In my
> experience, most people hear about and end up using VACUUM FULL because
> things got out of control and they want to get into a better spot (I have been
> in that boat). I think with a small section that says, in essence,
> "hey, now that
> you understand why/how vacuum reclaims disk resources normally, if you're
> in a position where things aren't in a good state, this is what you need to know
> if you want to reclaim space from a really inefficient table"
>
> For me, at least, I think it would be easier to read/grok what you're sharing in
> these callouts.

That's the kind of thing that I had planned on with VACUUM FULL,
actually. You know, once I'm done with freezing. There is passing
mention of this in the relevant commit message.

> In my experience, one of the biggest issues with the thresholds and recovering
> space is the idea of tuning individual tables, not just the entire
> database. 5/10/20%
> might be fine for most tables, but it's usually the really active ones
> that need the
> tuning, specifically lowering the thresholds. That doesn't come across to me in
> this section at all. Again, maybe I've missed something on another page and
> it's all good, but it felt worth calling out.

I think that that's true. The rules are kind of different for larger tables.

> read and inspect excessive amounts of data at query time or when normal vacuum
> maintenance kicks in, VACUUM has a second job called freezing, which
> accomplishes three goals: (attempting to tie in the three sections)
> * speeding up queries and vacuum operations by...
> * advancing the transaction ID space on generally static tables...
> * ensure there are always free transaction IDs available for normal
> operation...
> "
>
> Maybe totally worthless and too much, but something like that might set a reader
> up for just a bit more context. Then you could take most of what comes before
> "2.5.2.2.1 Aggressive Vacuum" as a subsection (would require a renumber below)
> with something like "2.5.2.2.1 Normal Freezing Activity"

I think I know what you mean. What I've tried to do here is start with
freezing, and describe it as something that has immediate benefits,
that can be understood as useful, independently of its role in
advancing relfrozenxid later on. So now you wonder: what specific
benefits do I get?

It's hard to be too concrete about those benefits, because you have
things like hint bits. I could say something like that, but I think
I'd have to hedge too much, because you also have hint bits, that help
query response times in roughly the same way (albeit less reliably,
albeit without being set on physical replication standbys when they're
set on the primary).

> 8) Note "In PostgreSQL versions before 16..."
> Showing my naivety, somehow this isn't connecting with me totally. If
> it's important
> to call out, then maybe we need a connecting sentence. Based on the content
> above, I think you're pointing to "It's also why VACUUM will freeze all eligible
> tuples from a heap page once the decision to freeze at least one tuple
> is taken:"
> If that's it, it's just not clear to me what's totally changed. Sorry,
> more learning. :-)

In Postgres 15, vacuum_freeze_min_age was applied in a way that only
froze whatever XIDs could be frozen from the page -- so if you had
half the tuples that were older, and half that were younger, you'd
only freeze the older half. Even when it might have cost you
practically nothing to freeze them all in one go. Now, as the text
you've quoted points out, vacuum_freeze_min_age triggers freezing at
the level of whole pages, including for new XIDs (though only if
they're eligible to be frozen, meaning that everybody agrees that
they're all visible now). So vacuum_freeze_min_age picks pages to
freeze, not individual tuples to freeze (this optimization is so
obvious that it's a little surprising that it took as long as it did
to get in).

Page-level freezing justifies the following statement from the patch,
for example:

"It doesn't matter if it was vacuum_freeze_table_age or
vacuum_multixact_freeze_table_age that made VACUUM use its aggressive
strategy. Every aggressive VACUUM will advance relfrozenxid and
relminmxid by applying the same generic policy that controls which
pages are frozen."

Now, since freezing works at the level of physical heap pages in 16,
the thing that triggers aggressive VACUUM matters less (just as the
thing that triggers freezing of individual pages matters much less --
freezing is freezing). There is minimal risk of freezing the same page
3 times during each of 3 different aggressive VACUUMs. To a much
greater extent, 3 aggressive VACUUMs isn't that different to only 1
aggressive VACUUM for those pages that were already "settled" from the
start. As a result, the addition of page-level freezing made
vacuum_freeze_min_age somewhat less bad -- in 16, its behavior was a
little less dependent on the phase of the moon (especially during
aggressive VACUUMs).

I really value stuff like that -- cases where you as a user can think
of something as independent to some other thing that you also need to
tune. There needs to be a lot more such improvements, but at least we
have this one now.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Xiaoran Wang 2023-05-13 04:03:53 Re: [PATCH] Use RelationClose rather than table_close in heap_create_with_catalog
Previous Message Thomas Munro 2023-05-13 00:48:49 Re: Large files for relations