Re: Sync Rep and shutdown Re: Sync Rep v19

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Yeb Havinga <yebhavinga(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sync Rep and shutdown Re: Sync Rep v19
Date: 2011-03-17 06:08:51
Message-ID: AANLkTim5iu+TCnc8PRtWT_f4A61GMkJAk=rPuHAz+Ys-@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 17, 2011 at 2:35 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> 1. If a die interrupt is received (pg_terminate_backend or fast
> shutdown), then terminate the sync rep wait and arrange for the
> connection to be closed without acknowledging the commit (but do send
> a warning message back).  The commit still happened, though, so other
> transactions will see its effects.  This is unavoidable unless we're
> willing to either ignore attempts to terminate a backend waiting for
> sync rep, or panic the system when it happens, and I don't think
> either of those is appropriate.

OK.

> 2. If a query cancel interrupt is received (pg_cancel_backend or ^C),
> then cancel the sync rep wait and issue a warning before acknowledging
> the commit.  Again, the alternative is to either ignore the cancel or
> panic, neither of which I believe to be what users will want.

OK.

> 3. If synchronous_standby_names is changed to '' by editing
> postgresql.conf and issuing pg_ctl reload, then cancel all waits in
> progress and wake everybody up.  As I mentioned before, reloading the
> config file from within the waiting backend (which can't safely throw
> an error) seems risky,

AFAIR, ProcessConfigFile() doesn't throw an error, so I don't think
that's so risky. But I agree with what you said in another thread:
reading the config file at that point is inconsistent. It seems better
to leave it to a background process to wake up the backends.

> so what I did instead is made WAL writer
> responsible for handling this.  Nobody's allowed to wait for sync rep
> unless a global shared memory flag is set, and the WAL writer process
> is responsible for setting and clearing this flag when the config file
> is reloaded.  This has basically no performance cost; WAL writer only
> ever does any extra work at all with this code when it receives a
> SIGHUP, and even then the work is trivial except in the case where
> synchronous_standby_names has changed from empty to non-empty or vice
> versa.  The advantage of putting this in WAL writer rather than, say,
> bgwriter is that WAL writer doesn't have nearly as many jobs to do and
> they don't involve nearly as much I/O, so the chances of a long delay
> due to the process being busy are much less.

This occurs to me: in the shutdown case, shouldn't we ensure that
walwriter exits only after all the backends have gone? I'm not sure
whether this case is worth worrying about, but what if
synchronous_standby_names is unset and the config file is reloaded
after a smart shutdown has been requested? In that case the reload
cannot wake up the waiting backends because walwriter has already
exited. That behavior looks a bit inconsistent.
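
For reference, the walwriter-side handling described above would look
something like this. This is just a sketch reconstructed from the
description, not the patch itself; the function, macro, and field
names are my guesses:

    /*
     * Sketch: called by walwriter after it rereads the config file.
     * WalSndCtl->sync_standbys_defined is the global shared-memory
     * flag described above; SyncStandbysDefined() tests whether
     * synchronous_standby_names is non-empty (names assumed).
     */
    void
    SyncRepUpdateSyncStandbysDefined(void)
    {
        bool    sync_standbys_defined = SyncStandbysDefined();

        /*
         * Only do real work when the setting flipped between empty
         * and non-empty, so the reload path stays cheap.
         */
        if (sync_standbys_defined != WalSndCtl->sync_standbys_defined)
        {
            LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);

            /*
             * If sync rep was just turned off, wake everyone still
             * waiting; no walsender would ever release them.
             */
            if (!sync_standbys_defined)
                SyncRepWakeQueue(true);

            WalSndCtl->sync_standbys_defined = sync_standbys_defined;
            LWLockRelease(SyncRepLock);
        }
    }

If walwriter exits before the backends do, nobody runs this anymore,
which is exactly the smart-shutdown hole I mean.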

> 4. Remove the SYNC_REP_MUST_DISCONNECT state, which actually does
> absolutely nothing right now, despite what the name would seem to
> imply.  In particular, it doesn't arrange for any sort of disconnect.
> This patch does arrange for that, but not using this mechanism.

OK.

> 5. The existing code relies on being able to read MyProc->syncRepState
> without holding the lock, even while a WAL sender must be updating it
> in another process.  I'm not 100% sure this is safe on a
> multi-processor machine with weak memory ordering.  In practice, the
> chances of something going wrong here seem extremely small.  You'd
> need something like this: a WAL sender updates MyProc->syncRepState
> just after the wait timeout expires and before the latch is reset, but
> the regular backend fails to see the state change due to
> memory-ordering effects and drops through the loop, waiting another 60
> s, and then finally wakes up and completes the wait (but a minute
> later than expected).  That seems vanishingly unlikely but it's also
> simple to protect against, so I did.

In the patch, you take a light-weight lock in order to read the latest
value. But I wonder how taking the lock ensures that the value is
up-to-date.
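
That is, the pattern I mean is roughly this (illustrative, not quoted
from the patch): an unlocked fast-path read, retried under SyncRepLock
when the state still looks like we are waiting.

    syncRepState = MyProc->syncRepState;
    if (syncRepState == SYNC_REP_WAITING)
    {
        /* Re-read under the lock to get the authoritative value. */
        LWLockAcquire(SyncRepLock, LW_SHARED);
        syncRepState = MyProc->syncRepState;
        LWLockRelease(SyncRepLock);
    }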

> Review appreciated.

Thanks! Here are some comments:

+ * WAL writer calls this as needed to update the shared sync_standbys_needed

Typo: s/sync_standbys_needed/sync_standbys_defined

+ * we exit normally, or SYNC_REP_MUST_DISCONNECT in abnormal cases.

The reference to SYNC_REP_MUST_DISCONNECT is no longer needed, since
the patch removes that state.

+ * So in this case we issue a NOTICE (which some clients may

Typo: s/NOTICE/WARNING

+ if (ProcDiePending)
+ {
+     ereport(WARNING,
+             (errcode(ERRCODE_ADMIN_SHUTDOWN),
+              errmsg("canceling the wait for replication and terminating connection due to administrator command"),
+              errdetail("The transaction has already been committed locally but might have not been replicated to the standby.")));
+     whereToSendOutput = DestNone;
+     LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
+     MyProc->syncRepState = SYNC_REP_WAIT_COMPLETE;
+     SHMQueueDelete(&(MyProc->syncRepLinks));

SHMQueueIsDetached() should be checked before calling SHMQueueDelete();
a walsender can already have removed the backend from the queue before
we reach here.
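
Something like this, for example (just a sketch of the guard I have
in mind):

    LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
    MyProc->syncRepState = SYNC_REP_WAIT_COMPLETE;

    /* A walsender may already have removed us from the queue. */
    if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
        SHMQueueDelete(&(MyProc->syncRepLinks));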

+ if (QueryCancelPending)
+ {
+     QueryCancelPending = false;
+     ereport(WARNING,
+             (errmsg("canceling wait for synchronous replication due to user request"),
+              errdetail("The transaction has committed locally, but may not have replicated to the standby.")));
+     LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
+     MyProc->syncRepState = SYNC_REP_WAIT_COMPLETE;
+     SHMQueueDelete(&(MyProc->syncRepLinks));
+     LWLockRelease(SyncRepLock);

Same as above.

+ if (!PostmasterIsAlive(true))
+ {
+     whereToSendOutput = DestNone;
+     proc_exit(1);

proc_exit() should not be called at that point because it leads to a
PANIC.
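
One possibility is to stop waiting and let the normal die-interrupt
handling terminate the backend instead, for example (a sketch; the
detach logic just mirrors the cases above):

    if (!PostmasterIsAlive(true))
    {
        ProcDiePending = true;
        whereToSendOutput = DestNone;

        /*
         * Detach from the queue as in the cases above, then fall out
         * of the wait loop; the regular die-interrupt path can then
         * terminate the backend without a PANIC.
         */
        LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
        if (!SHMQueueIsDetached(&(MyProc->syncRepLinks)))
            SHMQueueDelete(&(MyProc->syncRepLinks));
        LWLockRelease(SyncRepLock);
        break;
    }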

I think it's better to check ProcDiePending, QueryCancelPending, and
PostmasterIsAlive *before* waiting on the latch, not after, because
those events can occur before we reach the wait, and it's not worth
waiting 60 seconds just to detect them.
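
In other words, a wait loop shaped roughly like this (a sketch only;
the 60-second timeout comes from your description above, and I'm
assuming the current latch API with a timeout in microseconds):

    for (;;)
    {
        ResetLatch(&MyProc->waitLatch);

        /*
         * Check exit conditions first, so events that fired before
         * we got here are noticed without a full timeout cycle.
         */
        if (ProcDiePending || QueryCancelPending ||
            !PostmasterIsAlive(true))
            break;          /* handle each case as discussed above */

        if (MyProc->syncRepState == SYNC_REP_WAIT_COMPLETE)
            break;          /* a walsender has released us */

        WaitLatch(&MyProc->waitLatch, 60000000L);   /* 60s */
    }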

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
