Re: Testing autovacuum wraparound (including failsafe)

From: Ian Lawrence Barwick <barwick(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Testing autovacuum wraparound (including failsafe)
Date: 2022-11-16 04:38:10
Message-ID: CAB8KJ=j1b3kscX8Cg5G=Q39ZQsv2x4URXsuTueJLz=fcvJ3eoQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 30, 2022 at 10:40 Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> Hi,
>
> On Tue, Feb 1, 2022 at 11:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Jun 11, 2021 at 10:19 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > Hi,
> > >
> > > On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:
> > > > Cool. Thank you for working on that!
> > > > Could you please share a WIP patch for the $subj? I'd be happy to help with
> > > > it.
> > >
> > > I've attached the current WIP state, which hasn't evolved much since
> > > this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
> > > but I'm not sure that's the best place. But I didn't think
> > > src/test/recovery is great either.
> > >
> >
> > Thank you for sharing the WIP patch.
> >
> > Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time
> > for zeroing out all pages), how about using single-user mode instead
> > of preparing the transaction? That is, after pg_resetwal we check the
> > ages of datfrozenxid by executing a query in single-user mode. That
> > way, we don’t need to worry about autovacuum concurrently running
> > while checking the ages of frozenxids. I’ve attached a PoC patch that
> > does the scenario like:
> >
> > 1. start cluster with autovacuum=off, create tables with a small
> > amount of data, and generate garbage in them
> > 2. stop cluster and do pg_resetwal
> > 3. start cluster in single-user mode
> > 4. check age(datfrozenxid)
> > 5. stop cluster
> > 6. start cluster and wait for autovacuums to increase template0,
> > template1, and postgres datfrozenxids
>
> The above steps are wrong.
>
> I think we can expose a function in an extension used only by this
> test in order to set nextXid to a future value with zeroing out
> clog/subtrans pages. We don't need to fill all clog/subtrans pages
> between oldestActiveXID and nextXid. I've attached a PoC patch for
> adding this regression test and am going to register it to the next
> CF.
>
> BTW, while testing the emergency situation, I found there is a race
> condition where anti-wraparound vacuum isn't invoked with the settings
> autovacuum = off, autovacuum_max_workers = 1. An autovacuum worker
> sends a signal to the postmaster after advancing datfrozenxid in
> SetTransactionIdLimit(). But with the settings, if the autovacuum
> launcher attempts to launch a worker before the autovacuum worker who
> has signaled to the postmaster finishes, the launcher exits without
> launching a worker due to no free workers. The new launcher won’t be
> launched until new XID is generated (and only when new XID % 65536 ==
> 0). Although autovacuum_max_workers = 1 is not mandatory for this
> test, it's easier to verify the order of operations.
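
As an aside on the "new XID % 65536 == 0" condition quoted above: a minimal sketch of the arithmetic, to show how long the wait for the next launcher can be once the race is lost. This is illustration only, not PostgreSQL code. The function name is hypothetical, and it assumes plain integer XIDs, ignoring 32-bit wraparound and the reserved special XIDs.

```python
# Hypothetical illustration, not PostgreSQL code.
# The interval comes from the "new XID % 65536 == 0" condition in the quoted text.
SIGNAL_INTERVAL = 65536


def next_signal_xid(xid: int) -> int:
    """Return the smallest XID >= xid that satisfies xid % 65536 == 0.

    Simplification (assumption): XIDs are treated as plain non-negative
    integers; wraparound and reserved XIDs are ignored.
    """
    # (-xid) % SIGNAL_INTERVAL is the distance up to the next multiple,
    # and is 0 when xid is already a multiple.
    return xid + (-xid) % SIGNAL_INTERVAL


if __name__ == "__main__":
    # If the race is lost just after a multiple of 65536, nearly a full
    # interval of XIDs has to be consumed before the condition holds again.
    xid = 65537
    print(next_signal_xid(xid) - xid)
```

In the worst case nearly 65536 further XIDs have to be assigned before the condition is met again, which matters for a test that expects the anti-wraparound vacuum to start promptly.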

Hi

Thanks for the patch. While reviewing the patch backlog, we have determined that
the latest version of this patch was submitted before meson support was
implemented, so it should have a "meson.build" file added for consideration for
inclusion in PostgreSQL 16.

Regards

Ian Barwick