Re: [Patch] ALTER SYSTEM READ ONLY

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Date: 2021-03-14 18:21:02
Message-ID: CALtqXTfqAnPk+bpbXk_LfH=N09nibhnt_CfkU3Nk1+Zyv4_uRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 9, 2021 at 3:31 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:

> On Thu, Mar 4, 2021 at 11:02 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> >
> > On Wed, Mar 3, 2021 at 8:56 PM Robert Haas <robertmhaas(at)gmail(dot)com>
> wrote:
> > >
> > > On Tue, Mar 2, 2021 at 7:22 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com>
> wrote:
> > > > Why do we need to move promote related code in XLogAcceptWrites?
> > > > IMHO, this promote related handling should be in StartupXLOG only.
> > > > That will look cleaner.
> > >
> > > The key design question here, at least in my mind, is what exactly
> > > happens after prohibit-WAL + system-crash + recovery-finishes. We
> > > clearly can't write the checkpoint or end-of-recovery record and
> > > proceed with business as usual, but are we still in recovery? Either
> > > (1) we are technically still in recovery, stopping just short of
> > > entering normal running, and will emerge from recovery when WAL is
> > > permitted again; or (2) we have technically finished recovery, but
> > > deferred some of the actions that would normally occur at that time
> > > until a later point. Maybe this is academic distinction as much as
> > > anything, but the idea is if we choose #1 then we should do as little
> > > as possible at the point when recovery finishes and defer as much as
> > > possible until we actually enter normal running; whereas if we choose
> > > #2 we should do as much as possible at the point when recovery
> > > finishes and defer only those things which absolutely have to be
> > > deferred. That said, I and I think also Andres are voting for #2.
> > >
> > > But if we go that way, that precludes what you are proposing here. If
> > > we picked #1 then it would be natural for the startup process to
> > > remain active and the control file update to be postponed until WAL
> > > writes are re-enabled; but under model #2 we want, if possible, for
> > > the startup process to exit and the control file update to happen
> > > normally, and only the writing of the actual WAL records to be
> > > deferred.
> > >
> >
> > Current patch doing a mix of both, startup process exits without doing
> > WAL writes and control file updates, that happens later when system
> > changes to read-write.
> >
> > > What I find much odder, looking at the present patch, is that
> > > PerformPendingStartupOperations() gets called from pg_prohibit_wal()
> > > rather than by the checkpointer. If the checkpointer is the process
> > > that is in charge of coordinating the change between a read-only state
> > > and a read-write state, then it ought to also do this. I also think
> > > that the PerformPendingStartupOperations() wrapper is unnecessary.
> > > Just invert the sense of the XLogCtl flag: xlogAllowWritesDone, rather
> > > than startupCrashRecoveryPending, and have XLogAcceptWrites() set it
> > > (and return without doing anything if it's already set). Then the
> > > checkpointer can just call the function unconditionally whenever we go
> > > read-write, and for a bonus we will have much better naming
> > > consistency, rather than calling the same thing "xlog accept writes"
> > > in one place, "pending startup operations" in another, and "startup
> > > crash recovery pending" in a third.
> > >
> >
> > Ok, in the attached version, I have used the xlogAllowWritesDone
> variable.
> > To match the naming sense, it should be set to 'false' initially and
> > should get set to 'true' when the XLogAcceptWrites() operation completes.
> >
> > I have removed the PerformPendingStartupOperations() wrapper function
> and I have
> > slightly changed XLogAcceptWrites() to minimize its parameter count so
> that it
> > can use available global variable values instead of parameters.
> Unfortunately,
> > it cannot be called from checkpointer unconditionally, it will create a
> race
> > with startup process when startup process still in recovery and
> checkpointer
> > launches and see that xlogAllowWritesDone = false, will go-ahead for
> those wal
> > write operations and end-of-recovery checkpoint which will be a
> > disaster. Therefore, I moved this XLogAcceptWrites() function inside
> > ProcessWALProhibitStateChangeRequest() and called when the system is in
> > GOING_READ_WRITE transition state. Since
> ProcessWALProhibitStateChangeRequest()
> > gets called from a different places of checkpointer process which
> creates a
> > cascaded call to XLogAcceptWrites() function, to avoid that I am updating
> > xlogAllowWritesDone = true immediately after it gets checked in
> > XLogAcceptWrites() which I think is not the right approach, technically,
> it
> > should be updated at the end of XLogAcceptWrites().
> >
> > I think instead of xlogAllowWritesDone, we should use invert of it, as
> > the previous, e.g.xlogAllowWritesPending or xlogAllowWritesSkipped or
> something
> > else and that will be get explicitly set 'true' when we skip
> XLogAcceptWrites()
> > call. That will avoid the race of checkpointer process with the startup
> since
> > initially, it will be 'false', and if it is 'false' we will return
> immediately
> > from XLogAcceptWrites(). Also, we don't need to move XLogAcceptWrites()
> inside
> > ProcessWALProhibitStateChangeRequest(), it can be called from
> checkpointerMain()
> > loop which also avoids cascade calls and we don't need to update it
> until we
> > complete those write operations. Thoughts/Comments?
> >
> In the attached version, I am able to fix most of the concerns that I had.
> Right
> now, having the xlogAllowWritesDone variable is fine, and that will get
> updated
> at the end of the XLogAcceptWrites() function, unlike the previous.
> XLogAcceptWrites() will be called from
> ProcessWALProhibitStateChangeRequest()
> while the system state changes to read-write, like previous. Now to avoid
> the
> recursive call to ProcessWALProhibitStateChangeRequest() from the
> end-of-recovery checkpoint happening in XLogAcceptWrites(), I have added a
> private boolean state variable in walprohibit.c, using it wal prohibit
> state
> transition can be put on hold for some time; did the same while calling
> XLogAcceptWrites().
>
> Regards,
> Amul
>

One of the
patch (v18-0002-Error-or-Assert-before-START_CRIT_SECTION-for-WA.patch)
from the latest patchset does not apply successfully.

http://cfbot.cputube.org/patch_32_2602.log

=== applying patch
./v18-0002-Error-or-Assert-before-START_CRIT_SECTION-for-WA.patch

Hunk #15 succeeded at 2604 (offset -13 lines).
1 out of 15 hunks FAILED -- saving rejects to file
src/backend/access/nbtree/nbtpage.c.rej
patching file src/backend/access/spgist/spgdoinsert.c

It is a very minor change, so I rebased the patch. Please take a look, if
that works for you.

--
Ibrar Ahmed

Attachment Content-Type Size
v19-0002-Error-or-Assert-before-START_CRIT_SECTION-for-WA.patch application/octet-stream 69.2 KB
v19-0001-Implement-wal-prohibit-state-using-global-barrie.patch application/octet-stream 51.4 KB
v19-0003-WIP-Documentation.patch application/octet-stream 10.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ibrar Ahmed 2021-03-14 18:55:06 Re: Transactions involving multiple postgres foreign servers, take 2
Previous Message Fabien COELHO 2021-03-14 16:08:30 Re: pgbench - add pseudo-random permutation function