Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From: Andres Freund <andres(at)anarazel(dot)de>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Date: 2023-04-05 21:15:34
Message-ID: 20230405211534.4skgskbilnxqrmxg@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-04-05 16:17:20 -0400, Melanie Plageman wrote:
> On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Perhaps the best solution for autovac vs interactive vacuum issue would be to
> > move the allocation of the bstrategy to ExecVacuum()?
>
> So, I started looking into allocating the bstrategy in ExecVacuum().
>
> While doing so, I was trying to understand if the "sanity checking" in
> vacuum() could possibly apply to autovacuum, and I don't really see how.
>
> AFAICT, autovacuum does not ever set VACOPT_DISABLE_PAGE_SKIPPING or
> VACOPT_FULL or VACOPT_ONLY_DATABASE_STATS.
>
> We could move those sanity checks up into ExecVacuum().

Would make sense.

ISTM that eventually most of what currently happens in vacuum() should be in
ExecVacuum(). There's a lot of stuff that can't happen for autovacuum. So it
just seems to make more sense to move those parts to ExecVacuum().

> I also noticed that we make the vac_context in vacuum() which says it is
> for "cross-transaction storage". We use it for the buffer access
> strategy and for the newrels relation list created in vacuum(). Then we
> delete it at the end of vacuum().

> Autovacuum workers already make a similar kind of memory context called
> AutovacMemCxt in do_autovacuum() which the comment says is for the list
> of relations to vacuum/analyze across transactions.

AutovacMemCxt seems to be a bit longer lived / cover more than the context
created in vacuum(). It's where all the hash tables etc live that
do_autovacuum() uses to determine what to vacuum.

Note that do_autovacuum() also creates:

/*
* create a memory context to act as fake PortalContext, so that the
* contexts created in the vacuum code are cleaned up for each table.
*/
PortalContext = AllocSetContextCreate(AutovacMemCxt,
"Autovacuum Portal",
ALLOCSET_DEFAULT_SIZES);

which is then what vacuum() creates the "Vacuum" context in.

> What if we made ExecVacuum() make its own memory context and both it and
> do_autovacuum() pass that memory context (along with the buffer access
> strategy they make) to vacuum(), which then uses the memory context in
> the same way it does now?

Maybe? It's not clear to me why it'd be a win.

> It simplifies the buffer usage limit patchset and also seems a bit more
> clear than what is there now?

I don't really see what it'd make simpler? The context in vacuum() is used for
just that vacuum - we couldn't just use AutovacMemCxt, as that'd live much
longer (for all the tables a autovac worker processes).

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2023-04-05 21:23:31 Re: Negative cache entries for memoize
Previous Message Tom Lane 2023-04-05 21:11:11 Re: Using each rel as both outer and inner for JOIN_ANTI