Re: Eager page freeze criteria clarification

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject: Re: Eager page freeze criteria clarification
Date: 2023-09-27 21:26:42
Message-ID: CAH2-Wzk8w9VWYubH7e388yURBT5wKD8_M2G2E-darvPhUTcf-A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 27, 2023 at 1:45 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-09-27 13:14:41 -0700, Peter Geoghegan wrote:
> > As a general rule, I think that we're better off gambling against
> > future FPIs, and then pulling back if we go too far. The fact that we
> > went one VACUUM operation without the workload unsetting an
> > all-visible page isn't that much of a signal about what might happen
> > to that page.
>
> I think we can afford to be quite aggressive about opportunistically freezing
> when doing so wouldn't emit an FPI.

I agree. This part is relatively easy -- it is more or less v2 of the
FPI thing. The only problem I see is that it isn't all that compelling
on its own -- it just doesn't seem ambitious enough.

> I am much more concerned about cases where
> opportunistic freezing requires an FPI - it'll often *still* be the right
> choice to freeze the page, but we need a way to prevent that from causing a
> lot of WAL in worse cases.

What about my idea of holding back when some tuples are already frozen
from before? Admittedly that's still a fairly raw idea, but something
along those lines seems promising.
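To make the raw idea concrete, here is a toy sketch of the decision it implies (this is illustrative Python, not PostgreSQL code; the function and parameter names are mine): always freeze when no FPI would be emitted, but hold back an FPI-bearing freeze when the page already contains tuples frozen by an earlier VACUUM, on the theory that the workload keeps dirtying that page and another FPI would likely be wasted too.

```python
def should_opportunistically_freeze(freeze_needs_fpi: bool,
                                    n_already_frozen: int,
                                    n_freezable: int) -> bool:
    """Sketch of a per-page opportunistic freezing decision.

    freeze_needs_fpi  -- would freezing this page emit a full-page image?
    n_already_frozen  -- tuples frozen on this page by some earlier VACUUM
    n_freezable       -- tuples on this page that could be frozen now
    """
    if n_freezable == 0:
        return False                 # nothing to do
    if not freeze_needs_fpi:
        return True                  # the "easy case": freezing is nearly free
    if n_already_frozen > 0:
        # We froze here before and the page was dirtied again anyway --
        # gamble that another FPI would also be wasted, and hold back.
        return False
    return True                      # first FPI-bearing attempt: gamble the page settles
```

The interesting knob is the `n_already_frozen > 0` test; a real implementation would presumably want something less binary, but even this crude form captures "pull back if we went too far last time".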

If you limit yourself to what I've called the easy cases, then you
can't really expect to make a dent in the problems that we see with
large tables -- where FPIs are pretty much the norm rather than the
exception. Again, that's fine, but I'd be disappointed if you and
Melanie can't do better than that for 17.

> > 2. Large tables (i.e. the tables where it really matters) just don't
> > have that many VACUUM operations, relative to everything else.
>
> I think we need to make vacuums on large tables much more aggressive than they
> are now, independent of opportunistic freezing heuristics. It's idiotic that
> on large tables we delay vacuuming until multi-pass vacuums are pretty much
> guaranteed.

Not having to do all of the freezing at once will often still make
sense even in cases where we "lose". It's hard to describe precisely
how to assess such things (what's the break-even point?), but that
makes it no less true. Constantly losing by a small amount is usually
better than suffering occasional massive drops in performance.
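A toy cost model makes the point (numbers invented purely for illustration): eager freezing can have a *higher* total cost than deferring everything to an aggressive VACUUM, and still be preferable, because its worst single step is far smaller.

```python
def cumulative_cost(per_vacuum_overhead: float, n_vacuums: int,
                    deferred_spike: float = 0.0) -> list:
    """Running cost after each VACUUM. Eager freezing pays a small,
    constant overhead every time; lazy freezing pays nothing until the
    final (aggressive) VACUUM pays one large spike."""
    costs, total = [], 0.0
    for i in range(n_vacuums):
        total += per_vacuum_overhead
        if i == n_vacuums - 1:
            total += deferred_spike
        costs.append(total)
    return costs

eager = cumulative_cost(2.0, 10)                      # 2 units of extra WAL per VACUUM
lazy = cumulative_cost(0.0, 10, deferred_spike=15.0)  # one big catch-up at the end
```

Here `eager` ends at 20 units versus `lazy`'s 15, so eager "loses" on total WAL -- but its worst single VACUUM costs 2 units rather than 15, which is the smoother behavior argued for above.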

> The current logic made some sense when we didn't have the VM, but now
> autovacuum scheduling is influenced by the portion of the table that that
> vacuum will never look at, which makes no sense.

Yep:

https://www.postgresql.org/message-id/CAH2-Wz=MGFwJEpEjVzXwEjY5yx=UuNPzA6Bt4DSMasrGLUq9YA@mail.gmail.com

> > Who says we'll get more than one opportunity per page with these tables,
> > even with this behavior of scanning all-visible pages in non-aggressive
> > VACUUMs? Big append-only tables simply won't get the opportunity to catch
> > up in the next non-aggressive VACUUM if there simply isn't one.
>
> I agree that we need to freeze pages in append only tables ASAP. I don't think
> they're that hard a case to detect though. The harder case is the - IME very
> common - case of tables that are largely immutable but have a moving tail
> that's hotly updated.

Doing well with such "moving tail of updates" tables is exactly what
doing well on TPC-C requires. It's easy to detect that a table is one
of these two things; determining which of the two it is specifically
presents greater difficulties. So I don't see strict append-only
versus "append and update each row once" as all that different,
practically speaking, from the point of view of VACUUM.

Often, the individual pages of TPC-C's order lines table look very
much like pgbench_history-style pages would -- just because VACUUM is
either ahead of or behind the current "hot tail" position for almost
all pages that it scans, due to the current phase of the moon. This is
partly why I place so much emphasis on teaching VACUUM to understand
that what it sees in each page might well have a lot to do with when
it happened to show up, as opposed to something about the
workload/table itself.
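A trivial model of that timing effect (illustrative only; the geometry is deliberately oversimplified): take a table that is immutable except for a hot tail of pages. What fraction of the table a given VACUUM sees as "settled" is determined almost entirely by where the tail happened to be when VACUUM showed up, not by anything intrinsic to the pages or the workload.

```python
def settled_fraction(table_pages: int, tail_start: int) -> float:
    """Toy model: pages [0, tail_start) are immutable ("settled"),
    pages [tail_start, table_pages) are the hotly updated tail.
    Returns the fraction of the table a full scan reports as settled."""
    return min(tail_start, table_pages) / table_pages

# Identical workload, two different VACUUM arrival times:
early = settled_fraction(1000, 500)   # tail halfway through the table
late = settled_fraction(1000, 900)    # tail near the end
```

Two VACUUMs of the very same table report 50% versus 90% settled pages, purely as an artifact of scheduling -- which is why per-page observations need to be interpreted relative to when VACUUM ran.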

BTW, if you're worried about the "hot tail" case in particular then
you should definitely put the FSM stuff in scope here -- it's a huge
part of the overall problem, since it makes the pages take so much
longer to "settle" than you might expect when just considering the
workload abstractly.

--
Peter Geoghegan
