Re: New strategies for freezing, advancing relfrozenxid early

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Date: 2023-01-26 18:44:45
Message-ID: CAH2-WzkbuGaW61LzAfj=Ge7YcEBYmgyZ6dwTdVDYXGXPt3c2pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 26, 2023 at 9:53 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I assume the case you're thinking of is that pruning did *not* do any changes,
> but in the process of figuring out that nothing needed to be pruned, we did a
> MarkBufferDirtyHint(), and as part of that emitted an FPI?

Yes.

> > That's going to be very significantly more aggressive. For example
> > it'll impact small tables very differently.
>
> Maybe it would be too aggressive, not sure. The cost of a freeze WAL record is
> relatively small, with one important exception below, if we are 99.99% sure
> that it's not going to require an FPI and isn't going to dirty the page.
>
> The exception is that a newer LSN on the page can cause the ringbuffer
> replacement to trigger more more aggressive WAL flushing. No meaningful
> difference if we modified the page during pruning, or if the page was already
> in s_b (since it likely won't be written out via the ringbuffer in that case),
> but if checksums are off and we just hint-dirtied the page, it could be a
> significant issue.

Most of the overhead of FREEZE WAL records (with freeze plan
deduplication and page-level freezing in) is generic WAL record header
overhead. Your recent adversarial test case is going to choke on that,
too. At least if you set checkpoint_timeout to 1 minute again.

> Thus a modification of the above logic could be to opportunistically freeze if
> a ) it won't cause an FPI and either
> b1) the page was already dirty before pruning, as we'll not do a ringbuffer
> replacement in that case
> or
> b2) We wrote a WAL record during pruning, as the difference in flush position
> is marginal
>
> An even more aggressive version would be to replace b1) with logic that'd
> allow newly dirtying the page if it wasn't read through the ringbuffer. But
> newly dirtying the page feels like it'd be more dangerous.

In many cases we'll have to dirty the page anyway, just to set
PD_ALL_VISIBLE. The whole way the logic works is conditioned (whether
triggered by an FPI or triggered by my now-reverted GUC) on being able
to set the whole page all-frozen in the VM.

> A less aggressive version would be to check if any WAL records were emitted
> during heap_page_prune() (instead of FPIs) and whether we'd emit an FPI if we
> modified the page again. Similar to what we do now, except not requiring an
> FPI to have been emitted.

Also way more aggressive. Not nearly enough on its own.

> But to me it seems a bit odd that VACUUM now is more aggressive if checksums /
> wal_log_hint bits is on, than without them. Which I think is how using either
> of pgWalUsage.wal_fpi, pgWalUsage.wal_records ends up working?

Which part is the odd part? Is it odd that page-level freezing works
that way, or is it odd that page-level checksums work that way?

In any case this seems like an odd thing for you to say, having
eviscerated a patch that really just made the same behavior trigger
independently of FPIs in some tables, controlled via a GUC.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-01-26 18:53:20 Re: Syncrep and improving latency due to WAL throttling
Previous Message Karl O. Pinc 2023-01-26 18:36:38 Re: drop postmaster symlink