Re: Eager page freeze criteria clarification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-09-27 21:44:38
Message-ID: CAH2-Wz=ObFH4yxO5BFY_r=jaOgupJgH7LbpdfDAz4CWXdd6inw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 27, 2023 at 2:26 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Wed, Sep 27, 2023 at 1:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I think we need to make vacuums on large tables much more aggressive than they
> > are now, independent of opportunistic freezing heuristics. It's idiotic that
> > on large tables we delay vacuuming until multi-pass vacuums are pretty much
> > guaranteed.
>
> Not having to do all of the freezing at once will often still make
> sense in cases where we "lose".

One more thing on this, and the subject of large table that keep
getting larger (including those with a "hot tail" of updates):

Since autovacuum runs against such tables at geometric intervals (as
determined by autovacuum_vacuum_insert_scale_factor), the next VACUUM
is always going to be longer and more expensive than this VACUUM,
forever (ignoring the influence of aggressive mode for a second). This
would even be true if we didn't have the related problem of
autovacuum_vacuum_insert_scale_factor not accounting for the fact that
when VACUUM starts and when VACUUM ends aren't exactly the same thing
in large tables [1] -- that aspect just makes the problem even worse.

Basically, the whole "wait and see" approach makes zero sense here
because we really do need to be aggressive about freezing just to keep
up with the workload. The number of pages we'll scan in the next
VACUUM will always be significantly larger, even if we're very
aggressive about freezing (theoretically it might not be, but then
what VACUUM does doesn't matter that much either way). Time is very
much not on our side here. So we need to anticipate what happens next
with the workload, and how that affects VACUUM in the future -- not
just how VACUUM affects the workload. (VACUUM is just another part of
the workload, in fact.)

[1] https://www.postgresql.org/message-id/CAH2-Wzn=bZ4wynYB0hBAeF4kGXGoqC=PZVKHeerBU-je9AQF=g@mail.gmail.com
--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-09-27 22:53:45 Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()
Previous Message Thomas Munro 2023-09-27 21:30:10 Re: Streaming I/O, vectored I/O (WIP)