Re: Call lazy_check_wraparound_failsafe earlier for parallel vacuum

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Call lazy_check_wraparound_failsafe earlier for parallel vacuum
Date: 2022-11-10 23:09:23
Message-ID: CAH2-WznboZ+Z40u8i1+-9_7SLZRUg7jbsfV2KOaWugeVCTnYtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 10, 2022 at 10:20 AM Imseih (AWS), Sami <simseih(at)amazon(dot)com> wrote:
> Consistency is the key point here. It is odd that a serial
> vacuum may skip the remainder of the indexes if failsafe
> kicks-in, but in the parallel case it will go through the entire index
> cycle.

Yeah, it's a little inconsistent.

> > My gut instinct is that the most important thing is to at least call
> > lazy_check_wraparound_failsafe() once per index scan.
>
> I agree. And this should happen in the serial and parallel case.

I meant that there should definitely be a check between each round of
index scans (one index scan here affects each and every index). Doing
more than a single index scan is presumably rare, but are likely
common among VACUUM operations that take an unusually long time --
which is where the failsafe is relevant.

I'm just highlighting that multiple index scans (rather than just 0 or
1 index scans) is by far the primary risk factor that leads to a
VACUUM that takes way longer than is typical. (The other notable risk
comes from aggressive VACUUMs that freeze a great deal of heap pages
all at once, which I'm currently addressing by getting rid of the
whole concept of discrete aggressive mode VACUUM operations.)

> > This code effectively treats pages skipped using the visibility map as
> > equivalent to pages physically scanned (scanned_pages), even though
> > skipping requires essentially no work at all. That just isn't logical,
> > and feels like something worth fixing. The fundamental unit of work in
> > lazy_scan_heap() is a *scanned* heap page.
>
> It makes perfect sense to use the scanned_pages instead.

Want to have a go at writing a patch for that?

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-11-10 23:22:44 Re: [PATCH] ALTER TABLE ... SET STORAGE default
Previous Message Jacob Champion 2022-11-10 21:07:42 Re: User functions for building SCRAM secrets