Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss"

From:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss"
Date:	2015-12-17 17:04:25
Message-ID:	CAMkU=1yWky3fFnJ8AYAdOCctQWrEF0RWhU8v9GOtFFpxkF3Myw@mail.gmail.com
Lists:	pgsql-hackers

Sorry, I accidentally failed to include the list originally; here it is
for the list:

On Dec 16, 2015 9:52 AM, "Robert Haas" <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Dec 11, 2015 at 1:08 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> > Since changes to datfrozenxid are WAL logged at the time they occur,
> > but the supposedly-synchronous change to ShmemVariableCache is not WAL
> > logged until the next checkpoint, a well timed crash can leave you in
> > the state where the system is in a tizzy about wraparound but each
> > database says "Nope, not me".
>
> ShmemVariableCache is an in-memory data structure, so it's going to
> get blown away and rebuilt on a crash. But I guess it gets rebuilt
> from the contents of the most recent checkpoint record, so that
> doesn't actually help. However, I wonder if it would be safe for
> the autovacuum launcher to calculate an updated value and call
> SetTransactionIdLimit() to update ShmemVariableCache.

I was wondering if that should happen either at the end of crash
recovery (but I suppose you can't poll pg_database yet at that
point?), or immediately before throwing the "database is not accepting
commands to avoid wraparound data loss" error.

At which point would it make sense for the launcher to do it? I guess
just after it was started up under PMSIGNAL_START_AUTOVAC_LAUNCHER
conditions?

> But I'm somewhat confused what this has to do with Andres's report.

Doesn't it explain the exact situation he is in, where the oldest
database is 200 million, but the cluster as a whole is 2 billion?
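
To make that state concrete, here is a rough toy model in Python. It is not
PostgreSQL source: the helper names, the margin value, and the simplified
limit formula are all assumptions chosen for illustration. It only shows how
a limit rebuilt from an old checkpoint can keep refusing commands while the
freshly vacuumed database reports an age of 200 million.

```python
# Toy model only: names, margin, and formula are illustrative assumptions,
# not the actual PostgreSQL implementation.
XID_MASK = 0xFFFFFFFF   # transaction IDs behave as 32-bit circular counters
HALF_RANGE = 1 << 31    # about 2 billion XIDs


def xid_age(next_xid: int, frozen_xid: int) -> int:
    """Age of frozen_xid as seen from next_xid, modulo 2**32."""
    return (next_xid - frozen_xid) & XID_MASK


def stop_limit(oldest_frozen_xid: int, margin: int = 1_000_000) -> int:
    """XID at which new commands would be refused (simplified formula)."""
    return (oldest_frozen_xid + HALF_RANGE - margin) & XID_MASK


def refuses_commands(next_xid: int, limit: int) -> bool:
    """True once next_xid has reached or passed limit in circular order."""
    return ((next_xid - limit) & XID_MASK) < HALF_RANGE


# Before the vacuum: the oldest datfrozenxid is very old, and the limit
# derived from it is what the last checkpoint recorded.
old_frozen = 1000
checkpointed_limit = stop_limit(old_frozen)
next_xid = checkpointed_limit  # the cluster has just hit the limit

# Vacuum advances datfrozenxid; in this model that change survives a crash.
new_frozen = (next_xid - 200_000_000) & XID_MASK

# After the crash the in-memory limit is rebuilt from the checkpoint (stale).
stale_refuses = refuses_commands(next_xid, checkpointed_limit)
database_age = xid_age(next_xid, new_frozen)

# Recomputing the limit from the current datfrozenxid clears the condition.
fresh_refuses = refuses_commands(next_xid, stop_limit(new_frozen))
```

In this model `stale_refuses` is true even though `database_age` is only 200
million, and recomputing the limit from the current datfrozenxid makes
`fresh_refuses` false, which is the effect the SetTransactionIdLimit() idea
above is after.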

Cheers,

Jeff

In response to

Re: Cluster "stuck" in "not accepting commands to avoid wraparound data loss" at 2015-12-16 17:52:37 from Robert Haas

Responses

Re: Fwd: Cluster "stuck" in "not accepting commands to avoid wraparound data loss" at 2015-12-17 17:14:15 from Andres Freund
