Re: Sync Rep and shutdown Re: Sync Rep v19

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Yeb Havinga <yebhavinga(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sync Rep and shutdown Re: Sync Rep v19
Date: 2011-03-16 02:07:04
Message-ID: AANLkTi=Rmkx42W3xjEP=65QSnGZnjLLezRk=JoTtnkJE@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 9, 2011 at 11:11 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> Same as above. I think that it's more problematic to leave the code
> as it is. Because smart/fast shutdown can make the server get stuck
> until immediate shutdown is requested.

I agree that the current state of affairs is a problem. However,
after looking through the code somewhat carefully, it looks a bit
difficult to fix. Suppose that backend A is waiting for sync rep. A
fast shutdown is performed. Right now, backend A shrugs its shoulders
and does nothing. Not good. But suppose we change it so that backend
A closes the connection and exits without either confirming the commit
or throwing ERROR/FATAL. That seems like correct behavior, since, if
we weren't using sync rep, the client would have to interpret that as
indicating that the connection died in mid-COMMIT, and mustn't
assume anything about the state of the transaction. So far so good.

The problem is that there may be another backend B waiting on a lock
held by A. If backend A exits cleanly (without a PANIC), it will
remove itself from the ProcArray and release locks. That wakes up B,
which can now go do its thing. If the operating system is a bit on
the slow side delivering the signal to B, then the client to which B
is connected might manage to see a database state that shows the
transaction previously running in A as committed, even though that
transaction wasn't committed. That would stink, because the whole
point of having A hold onto locks until the standby ack'd the commit
was that no other transaction would see it as committed until it was
replicated.
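
To make the ordering concrete, here is a toy model of that race. It is
only a sketch: ToyCluster and all of its methods are hypothetical names
invented for illustration, not PostgreSQL code, and the model assumes
that a transaction becomes visible to others once it is committed
locally and its backend is gone from the (simulated) ProcArray.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical stand-in for shared state; not PostgreSQL's structures."""
    locks: dict = field(default_factory=dict)        # lock name -> holding backend
    in_progress: set = field(default_factory=set)    # stand-in for the ProcArray
    locally_committed: set = field(default_factory=set)
    replicated: set = field(default_factory=set)     # commits ack'd by the standby

    def commit_locally(self, backend, lock):
        # The commit is durable locally, but the backend keeps its lock and
        # its ProcArray entry while it waits for the standby to ack.
        self.locally_committed.add(backend)
        self.in_progress.add(backend)
        self.locks[lock] = backend

    def exit_cleanly(self, backend):
        # A clean exit (no PANIC) removes the ProcArray entry and releases locks.
        self.in_progress.discard(backend)
        for name in [n for n, holder in self.locks.items() if holder == backend]:
            del self.locks[name]

    def visible_as_committed(self, backend):
        return backend in self.locally_committed and backend not in self.in_progress

    def can_acquire(self, lock):
        return lock not in self.locks


def simulate_fast_shutdown_race():
    cluster = ToyCluster()
    cluster.commit_locally("A", "row-1")   # A is now waiting for sync rep
    before = (cluster.can_acquire("row-1"), cluster.visible_as_committed("A"))
    cluster.exit_cleanly("A")              # fast shutdown reaches A first
    # B's shutdown signal has not arrived yet, so B gets to look around.
    after = (cluster.can_acquire("row-1"), cluster.visible_as_committed("A"))
    return before, after, "A" in cluster.replicated
```

In this model simulate_fast_shutdown_race() returns
((False, False), (True, True), False): before A exits, B is blocked and
A's transaction is invisible; afterwards B can take the lock and sees
the transaction as committed, although the standby never ack'd it.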

This is a pretty unlikely race condition in practice but people who
are running sync rep are intending precisely to guard against unlikely
failure scenarios.

The only idea I have for allowing fast shutdown to still be fast, even
when sync rep is involved, is to shut down the system in two phases.
The postmaster would need to stop accepting new connections, and first
kill off all the backends that aren't waiting for sync rep. Then,
once all remaining backends are waiting for sync rep, we can have them
proceed as above: close the connection without acking the commit or
throwing ERROR/FATAL, and exit. That's pretty complicated, especially
given the rule that the postmaster mustn't touch shared memory, but I
don't see any alternative. We could just not allow fast shutdown, as
now, but I think that's worse.
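
As a rough sketch of the ordering such a two-phase shutdown would need
(assuming the set of backends and their states can be inspected at all,
which is exactly the hard part given the shared-memory rule), something
like the following. State and two_phase_fast_shutdown are hypothetical
names for illustration, not existing postmaster code.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    WAITING_FOR_SYNC_REP = "waiting_for_sync_rep"
    EXITED = "exited"


def two_phase_fast_shutdown(backends):
    """Terminate backends in two phases.

    backends maps a backend name to its State and is updated in place.
    Returns a list of (phase, name) pairs in termination order.
    Refusing new connections is assumed to have happened already.
    """
    order = []

    # Phase 1: kill every backend that is not waiting for sync rep, so
    # nobody is left who could observe state released in phase 2.
    for name, state in backends.items():
        if state is State.RUNNING:
            backends[name] = State.EXITED
            order.append((1, name))

    # Phase 2: only sync rep waiters remain. Each closes its connection
    # without acking the commit or raising ERROR/FATAL, then exits.
    for name, state in backends.items():
        if state is State.WAITING_FOR_SYNC_REP:
            backends[name] = State.EXITED
            order.append((2, name))

    return order
```

For example, with A waiting for sync rep and B and C running, the
function returns [(1, "B"), (1, "C"), (2, "A")]: A releases its locks
only after every backend that could have noticed is already gone.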

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
