Re: New strategies for freezing, advancing relfrozenxid early

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-06 23:28:16
Message-ID: CAH2-Wz=48Z2YrSMfuaH5aiYTFU_ofs3gXUYtz4up1vetvYa9Rw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 5, 2023 at 10:19 AM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> Could this use something like autovacuum_cost_delay? I don't quite
> like the use of arbitrary hardcoded millisecond delays

It's not unlike (say) the way that there can sometimes be hardcoded
waits inside GetMultiXactIdMembers(), which does run during VACUUM.

It's not supposed to be noticeable at all. If it is noticeable in any
practical sense, then the design is flawed, and should be fixed.

> it can slow a
> system down by a significant fraction, especially on high-contention
> systems, and this potential of 60ms delay per scanned page can limit
> the throughput of this new vacuum strategy to < 17 pages/second
> (<136kB/sec) for highly contended sections, which is not great.

We're only willing to wait the full 60ms when smaller waits don't work
out. And when 60ms doesn't do it, we'll then accept an older final
NewRelfrozenXid value. Our willingness to wait at all is conditioned
on whether settling for reduced lazy_scan_noprune processing of the
page would actually affect the final NewRelfrozenXid tracker. So the
waits are naturally self-limiting.
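
To make the shape of that concrete, here's a rough sketch of the kind
of self-limiting retry I mean (the function name, the argument, and
the exact backoff steps are all invented for illustration -- this
isn't the patch code):

#include "postgres.h"
#include "storage/bufmgr.h"

static bool
try_cleanup_lock_with_backoff(Buffer buf, bool page_holds_back_newrelfrozenxid)
{
	static const int sleep_ms[] = {10, 20, 30};	/* 60ms total, worst case */

	/* Cheap attempt first; the vast majority of pages succeed here */
	if (ConditionalLockBufferForCleanup(buf))
		return true;

	/*
	 * Don't wait at all unless settling for lazy_scan_noprune processing
	 * of this page would actually hold back the final NewRelfrozenXid.
	 */
	if (!page_holds_back_newrelfrozenxid)
		return false;

	for (int i = 0; i < lengthof(sleep_ms); i++)
	{
		pg_usleep(sleep_ms[i] * 1000L);
		if (ConditionalLockBufferForCleanup(buf))
			return true;
	}

	/* Give up, and accept an older final NewRelfrozenXid value */
	return false;
}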

You may be right that I need to do more about the possibility of
something like that happening -- it's a legitimate concern. But I
think that this may be enough on its own. I've never seen a workload
where more than a small fraction of all pages couldn't be cleanup
locked right away. But I *have* seen workloads where VACUUM vainly
waited forever for a cleanup lock on one single heap page.

> It is also not unlikely that in the time it was waiting, the page
> contents were updated significantly (concurrent prune, DELETEs
> committed), which could result in improved bounds. I think we should
> redo the dead items check if we waited, but failed to get a lock - any
> tuples removed now reduce work we'll have to do later.

I don't think that it matters very much. That's always true. It seems
very unlikely that we'll get better bounds here, unless it happens by
getting a full cleanup lock and then doing full lazy_scan_prune
processing after all.

Sure, it's possible that a concurrent opportunistic prune could make
the crucial difference, even though we ourselves couldn't get a
cleanup lock despite going to considerable trouble. I just don't think
that it's worth doing anything about.

> > +++ b/doc/src/sgml/ref/vacuum.sgml
> > [...] Pages where
> > + all tuples are known to be frozen are always skipped.
>
> "...are always skipped, unless the >DISABLE_PAGE_SKIPPING< option is used."

I'll look into changing this.

> > +++ b/doc/src/sgml/maintenance.sgml
>
> There are a lot of details being lost from the previous version of
> that document. Some of the details are obsolete (mentions of
> aggressive VACUUM and freezing behavior), but others are not
> (FrozenTransactionId in rows from a pre-9.4 system, the need for
> vacuum for prevention of issues surrounding XID wraparound).

I will admit that I really hate the "Routine Vacuuming" docs, and
think that they explain things in just about the worst possible way.

I also think that this needs to be broken up into pieces. As I said
recently, the docs are the part of the patch series that is the least
worked out.

> I also am not sure this is the best place to store most of these
> mentions, but I can't find a different place where these details on
> certain interesting parts of the system are documented, and plain
> removal of the information does not sit right with me.

I'm usually the person who argues for describing more implementation
details in the docs. But starting with low-level details here is
deeply confusing. At most these are things that should be discussed in
the context of internals, as part of some completely different
chapter.

I'll see about moving details of things like FrozenTransactionId somewhere else.

> Specifically, I don't like the removal of the following information
> from our documentation:
>
> - Size of pg_xact and pg_commit_ts data in relation to autovacuum_freeze_max_age
> Although it is less likely with the new behaviour that we'll hit
> these limits due to more eager freezing of transactions, it is still
> important for users to have easy access to this information, and
> tuning this for storage size is not useless information.

That is a fair point. Though note that these things have weaker
relationships with settings like autovacuum_freeze_max_age now. Mostly
this is a positive improvement (in the sense that we can truncate
SLRUs much more aggressively on average), but not always.
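
If I have the arithmetic right, the old rule of thumb works out to
about 2 commit status bits per XID in pg_xact and 10 bytes per XID in
pg_commit_ts (with track_commit_timestamp enabled). A quick
back-of-the-envelope calculation -- these are my own numbers, not
something taken from the docs verbatim:

#include <inttypes.h>
#include <stdio.h>

int
main(void)
{
	/* Default autovacuum_freeze_max_age */
	uint64_t	freeze_max_age = 200000000;

	/* ~2 commit status bits per XID retained in pg_xact */
	uint64_t	xact_bytes = freeze_max_age / 4;

	/* ~10 bytes per XID retained in pg_commit_ts */
	uint64_t	commit_ts_bytes = freeze_max_age * 10;

	printf("pg_xact      ~= %" PRIu64 " MB\n", xact_bytes / (1024 * 1024));
	printf("pg_commit_ts ~= %" PRIu64 " MB\n", commit_ts_bytes / (1024 * 1024));

	return 0;
}

That's roughly 50MB and 2GB at the default of 200 million, which lines
up with the 0.5GB/20GB figures the existing docs give for the 2
billion maximum.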

> - The reason why VACUUM is essential to the long-term consistency of
> Postgres' MVCC system
> Informing the user about our use of 32-bit transaction IDs and
> that we update an epoch when this XID wraps around does not
> automatically make the user aware of the issues that surface around
> XID wraparound. Retaining the explainer for XID wraparound in the docs
> seems like a decent idea - it may be moved, but please don't delete
> it.

We do need to stop telling users to enter single user mode. It's quite
simply obsolete, bad advice, and has been since Postgres 14. It's the
worst thing that you could do, in fact.

--
Peter Geoghegan
