Re: New strategies for freezing, advancing relfrozenxid early

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-27 03:01:03
Message-ID: CAH2-WzmxkgjdwfDqa2qPcuUmicbGXCXMgPMCXhaJ2Fqfa1UdRw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 26, 2023 at 6:37 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I also don't really see how that is responsive to anything else in my
> email. That's just as true for the current gating condition (the issuance of
> an FPI during heap_page_prune() / HTSV()).
>
> What I was wondering about is whether we should replace the
> fpi_before != pgWalUsage.wal_fpi
> with
> records_before != pgWalUsage.wal_records && !WouldIssueFpi(page)

I understand that. What I'm saying is that that's going to create a
huge problem of its own, unless you separately account for that
problem.

The simplest and most obvious example is something like a pgbench_tellers
table. VACUUM will generally run fast enough relative to the workload
that it will set some of those pages all-visible. Now it's going to
freeze them, too. Arguably it shouldn't even be setting the pages
all-visible, but now you make that existing problem much worse.

The important point is that there doesn't seem to be any good way
around thinking about the table as a whole if you're going to freeze
speculatively. This is not the same dynamic as we see with the FPI
thing IMV -- that's not nearly so speculative as what you're talking
about, since your proposal is speculative in roughly the same sense that eager
freezing was speculative (hence the suggestion that something like
vacuum_freeze_strategy_threshold could have a role to play).

The FPI thing is mostly about the cost now versus the cost later on.
You're gambling that you won't get another FPI later on if you freeze
now. But the cost of a second FPI later on is so much higher than the
added cost of freezing now that it's a very favorable bet, one that we
can afford to "lose" many times while still coming out ahead overall.
And even when we lose, you generally still won't have been completely
wrong -- even then there generally will indeed be a second FPI later
on for the same page, to go with everything else. This makes the
wasted freezing even less significant, on a comparative basis!
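
To make the "cost now versus cost later" bet concrete, here is a minimal
sketch of a toy cost model. It is not taken from the patch or from
PostgreSQL itself: the function names, the cost units, and the assumption
that a "lost" bet costs one later FPI plus one repeated freeze are all
hypothetical, chosen only to illustrate the asymmetry described above.

```c
/*
 * Toy model of the freeze-now bet. Everything here is hypothetical:
 * costs are in arbitrary units (think "bytes of WAL"), and the model
 * assumes that deferring freezing always costs one later FPI plus the
 * freeze itself.
 */

/* Expected cost if we freeze now, while an FPI is already being written. */
double
expected_cost_freeze_now(double freeze_cost, double later_fpi_cost,
                         double p_lose)
{
    /*
     * We always pay for freezing now. With probability p_lose the page is
     * modified again, so the freeze is wasted and we pay the later FPI
     * and a second freeze anyway.
     */
    return freeze_cost + p_lose * (later_fpi_cost + freeze_cost);
}

/* Expected cost if we defer freezing to some later VACUUM. */
double
expected_cost_defer(double freeze_cost, double later_fpi_cost)
{
    /* Assumed: the later freeze needs its own FPI in every case. */
    return later_fpi_cost + freeze_cost;
}

/*
 * Probability of losing the bet at which both strategies cost the same.
 * Below this value, freezing now is cheaper in expectation.
 */
double
break_even_loss_probability(double freeze_cost, double later_fpi_cost)
{
    return later_fpi_cost / (later_fpi_cost + freeze_cost);
}
```

With illustrative inputs of 100 units for the added freeze and 8192 units
for a later FPI, the break-even loss probability under this model is roughly
0.988. In other words, the bet could be lost the large majority of the time
and freezing now would still come out ahead, which is the asymmetry the
argument relies on. The real ratio depends on WAL compression, page contents,
and workload, none of which this sketch models.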

It's also likely true that an FPI in lazy_scan_prune is a much
stronger signal, but I think that the important dynamic is that we're
reasoning about "costs now vs costs later on". The asymmetry is really
important.

--
Peter Geoghegan
