Re: New strategies for freezing, advancing relfrozenxid early

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2022-08-30 20:45:19
Message-ID: CAH2-WznREt2q3P+D7bq4b0ytG_z6EOQUPNyoLB+CtEactZJPiA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 30, 2022 at 11:11 AM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> The solution involves more changes to the philosophy and mechanics of
> vacuum than I would expect, though. For instance, VM snapshotting,
> page-level-freezing, and a cost model all might make sense, but I don't
> see why they are critical for solving the problem above.

I certainly wouldn't say that they're critical. I tend to doubt that I
can be perfectly crisp about what the exact relationship is between
each component in isolation and how it contributes towards addressing
the problems we're concerned with.

> I think I'm
> still missing something. My mental model is closer to the bgwriter and
> checkpoint_completion_target.

That's not a bad starting point. The main thing that that mental model
is missing is how the timeframes work with VACUUM, and the fact that
there are multiple timeframes involved (maybe the system's vacuuming
work could be seen as having one timeframe at the highest level, but
it's more of a fractal picture overall). Checkpoints just don't take
that long, and checkpoint duration has a fairly low variance (barring
pathological performance problems).

You only have so many buffers that you can dirty, too -- it's a
self-limiting process. This is even true when (for whatever reason)
the checkpoint_completion_target logic just doesn't do what it's
supposed to do. There is more or less a natural floor on how bad
things can get, so you don't have to invent a synthetic floor at all.
LSM-based DB systems like the MyRocks storage engine for MySQL don't
use checkpoints at all -- the closest analog is compaction, which is
closer to a hybrid of VACUUM and checkpointing than anything else.

The LSM compaction model necessitates adding artificial throttling to
keep the system stable over time [1]. There is a disconnect between
the initial ingest of data, and the compaction process. And so
top-down modelling of costs and benefits with compaction is more
natural with an LSM [2] -- and not a million miles from the strategy
stuff I'm proposing.

> Allow me to make a naive counter-proposal (not a real proposal, just so
> I can better understand the contrast with your proposal):

> I know there would still be some problem cases, but to me it seems like
> we solve 80% of the problem in a couple dozen lines of code.

It's not that this statement is wrong, exactly. It's that I believe
that it is all but mandatory for me to ameliorate the downside that
goes with more eager freezing, for example by not doing it at all when
it doesn't seem to make sense. I want to solve the big problem of
freeze debt, without creating any new problems. And if I should also
make things in adjacent areas better too, so much the better.

Why stop at a couple dozen lines of code? Why not just change
the default of vacuum_freeze_min_age and
vacuum_multixact_freeze_min_age to 0?
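
To make concrete what a default of 0 would mean, here is a minimal
sketch of how the freeze cutoff follows from vacuum_freeze_min_age. It
is a simplified model, not the server's code: XIDs are treated as plain
integers, wraparound is ignored, and the function names are
hypothetical. The 50 million figure is the stock default for
vacuum_freeze_min_age.

```python
# Simplified model: XIDs are plain integers and wraparound is ignored.
# Function names are illustrative, not PostgreSQL identifiers.
DEFAULT_FREEZE_MIN_AGE = 50_000_000  # stock default of vacuum_freeze_min_age


def freeze_limit(oldest_xmin: int, freeze_min_age: int) -> int:
    """Return the cutoff; XIDs strictly older than it are eligible for freezing."""
    return oldest_xmin - freeze_min_age


def xids_to_freeze(xids, oldest_xmin, freeze_min_age):
    """Return the XIDs from `xids` that fall below the freeze cutoff."""
    limit = freeze_limit(oldest_xmin, freeze_min_age)
    return [xid for xid in xids if xid < limit]


oldest_xmin = 60_000_000
xids = [5_000_000, 20_000_000, 59_999_999]

# With the default, only the very old XID qualifies.
with_default = xids_to_freeze(xids, oldest_xmin, DEFAULT_FREEZE_MIN_AGE)

# With 0, everything older than OldestXmin qualifies.
with_zero = xids_to_freeze(xids, oldest_xmin, 0)
```

In this model a setting of 0 makes every XID older than OldestXmin
eligible for freezing in every VACUUM, with no regard for whether
freezing a given page is worth its cost.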

> a. Can you clarify some of the problem cases, and why it's worth
> spending more code to fix them?

For one thing, if we're going to do a lot of extra freezing, we really
want to "get credit" for it afterwards, by updating relfrozenxid to
reflect the new oldest extant XID, and so avoid getting an
antiwraparound VACUUM early, in the near future.

That isn't strictly true, of course. But I think that we at least
ought to have a strong bias in the direction of updating relfrozenxid,
having decided to do significantly more freezing in some particular
VACUUM operation.
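
The link between freezing and relfrozenxid can be sketched roughly as
follows. This is a simplified model under stated assumptions, not the
actual vacuumlazy.c logic: XIDs are plain integers with no wraparound,
and the Page fields and the new_relfrozenxid function are hypothetical
names. It shows that the extra freezing only earns "credit" when no
page that might still hold unfrozen XIDs was skipped.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    all_frozen: bool                    # all-frozen bit is set in the VM
    scanned: bool                       # this VACUUM actually read the page
    oldest_unfrozen_xid: Optional[int]  # oldest XID left after freezing, if any


def new_relfrozenxid(pages: List[Page], oldest_xmin: int) -> Optional[int]:
    """Return a safe new relfrozenxid, or None if it cannot be advanced."""
    candidate = oldest_xmin
    for page in pages:
        if page.all_frozen:
            continue  # no unfrozen XIDs here, safe whether scanned or not
        if not page.scanned:
            return None  # a skipped page may hold arbitrarily old XIDs
        if page.oldest_unfrozen_xid is not None:
            candidate = min(candidate, page.oldest_unfrozen_xid)
    return candidate


scanned_everything = [
    Page(all_frozen=True, scanned=False, oldest_unfrozen_xid=None),
    Page(all_frozen=False, scanned=True, oldest_unfrozen_xid=900),
    Page(all_frozen=False, scanned=True, oldest_unfrozen_xid=None),
]
skipped_all_visible = scanned_everything + [
    Page(all_frozen=False, scanned=False, oldest_unfrozen_xid=100),
]

advanced = new_relfrozenxid(scanned_everything, oldest_xmin=1000)
blocked = new_relfrozenxid(skipped_all_visible, oldest_xmin=1000)
```

In the second case a single skipped all-visible (but not all-frozen)
page is enough to forfeit advancement, however much freezing was done
elsewhere in the table.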

> b. How much of your effort is groundwork for related future
> improvements? If it's a substantial part, can you explain in that
> larger context?

Hard to say. It's true that the idea of VM snapshots is quite general,
and could have been introduced in a number of different ways. But I
don't think that that should count against it. It's also not something
that seems contrived or artificial -- it's at least as good a
reason to add VM snapshots as any other I can think of.

Does it really matter if this project is the freeze debt project, or
the VM snapshot project? Do we even need to decide which one it is
right now?

> c. Can some of your patches be separated into independent discussions?
> For instance, patch 1 has been discussed in other threads and seems
> independently useful, and I don't see the current work as dependent on
> it.

I simply don't know if I can usefully split it up just yet.

> Patch 4 also seems largely independent.

Patch 4 directly compensates for a problem created by the earlier
patches. The patch series as a whole isn't supposed to ameliorate the
problem of MultiXacts being allocated in VACUUM. It only needs to
avoid making the situation any worse than it is today IMV (I suspect
that the real fix is to make the VACUUM FREEZE command not tune
vacuum_freeze_min_age).

> d. Can you help give me a sense of scale of the problems solved by
> visibilitymap snapshots and the cost model? Do those need to be in v1?

I'm not sure. I think that having certainty that we'll be able to scan
only so many pages up-front is very broadly useful, though. Plus it
removes the SKIP_PAGES_THRESHOLD stuff, which was intended to enable
relfrozenxid advancement in non-aggressive VACUUMs, but does so in a
way that results in scanning many more pages needlessly. See commit
bf136cf6, which added the SKIP_PAGES_THRESHOLD stuff back in 2009,
shortly after the visibility map first appeared.
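
As a rough illustration of why that heuristic reads extra pages, here
is a toy model of the skipping rule. It is only a sketch: the
pages_scanned function is a hypothetical name, the exact boundary
condition and the special handling of the table's last page are glossed
over, and the visibility map is reduced to a list of booleans. The
threshold of 32 blocks matches the constant in vacuumlazy.c.

```python
SKIP_PAGES_THRESHOLD = 32  # blocks; value of the constant in vacuumlazy.c


def pages_scanned(all_visible, threshold=SKIP_PAGES_THRESHOLD):
    """Return the block numbers a non-aggressive VACUUM reads in this toy model.

    `all_visible[blk]` is True when the VM marks block `blk` all-visible.
    A run of all-visible blocks is skipped only if it is at least
    `threshold` blocks long; shorter runs are read anyway.
    """
    scanned = []
    nblocks = len(all_visible)
    blk = 0
    while blk < nblocks:
        if not all_visible[blk]:
            scanned.append(blk)
            blk += 1
            continue
        # Find the end of this run of all-visible blocks.
        run_end = blk
        while run_end < nblocks and all_visible[run_end]:
            run_end += 1
        if run_end - blk < threshold:
            scanned.extend(range(blk, run_end))  # run too short to skip
        blk = run_end
    return scanned


# Three pages need work; they are separated by all-visible runs of 10 and 40.
vm = [False] + [True] * 10 + [False] + [True] * 40 + [False]

with_threshold = pages_scanned(vm)
without_threshold = pages_scanned(vm, threshold=1)
```

Here only three pages need vacuuming, yet the short run of ten
all-visible pages gets read as well, and nothing about the rule
guarantees that reading them is enough to advance relfrozenxid.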

Since relfrozenxid advancement fundamentally works at the table level,
it seems natural to make it a top-down, VACUUM-level thing -- even
within non-aggressive VACUUMs (I guess it already meets that
description in aggressive VACUUMs). And since we really want to
advance relfrozenxid when we do extra freezing (for the reasons I just
went into), it seems natural to me to view it as one problem. I accept
that it's not clear cut, though.

[1] https://docs.google.com/presentation/d/1WgP-SlKay5AnSoVDSvOIzmu7edMmtYhdywoa0oAR4JQ/edit?usp=sharing
[2] https://disc-projects.bu.edu/compactionary/research.html
--
Peter Geoghegan
