Re: pgsql: Fix various possible problems with synchronous replication.

From: Thom Brown <thom(at)linux(dot)com>
To: Robert Haas <rhaas(at)postgresql(dot)org>
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Fix various possible problems with synchronous replication.
Date: 2011-03-17 17:24:34
Message-ID: AANLkTimv-SW2rDsmoo=hMLejh2PE9dNf7jaxyhNgp__3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 17 March 2011 17:12, Robert Haas <rhaas(at)postgresql(dot)org> wrote:
> Fix various possible problems with synchronous replication.
>
> 1. Don't ignore query cancel interrupts.  Instead, if the user asks to
> cancel the query after we've already committed it, but before it's on
> the standby, just emit a warning and let the COMMIT finish.
>
> 2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown).
> Instead, emit a warning message and close the connection without
> acknowledging the commit.  Other backends will still see the effect of
> the commit, but there's no getting around that; it's too late to abort
> at this point, and ignoring die interrupts altogether doesn't seem like
> a good idea.
>
> 3. If synchronous_standby_names becomes empty, wake up all backends
> waiting for synchronous replication to complete.  Without this, someone
> attempting to shut synchronous replication off could easily wedge the
> entire system instead.
>
> 4. Avoid depending on the assumption that if a walsender updates
> MyProc->syncRepState, we'll see the change even if we read it without
> holding the lock.  The window for this appears to be quite narrow (and
> probably doesn't exist at all on machines with strong memory ordering)
> but protecting against it is practically free, so do that.
>
> 5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and
> doesn't actually do anything.
>
> There's still some further work needed here to make the behavior of fast
> shutdown plausible, but that looks complex, so I'm leaving it for a
> separate commit.  Review by Fujii Masao.
>
> Branch
> ------
> master
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/9a56dc3389b9470031e9ef8e45c95a680982e01a
>
> Modified Files
> --------------
> doc/src/sgml/config.sgml            |    3 +-
> src/backend/postmaster/walwriter.c  |    6 +
> src/backend/replication/syncrep.c   |  302 ++++++++++++++++++++++-------------
> src/backend/tcop/postgres.c         |    6 +
> src/include/replication/syncrep.h   |    4 +-
> src/include/replication/walsender.h |    7 +
> 6 files changed, 214 insertions(+), 114 deletions(-)

errmsg("canceling the wait for replication and terminating connection
due to administrator command")
errmsg("canceling wait for synchronous replication due to user request")

Should that first one then also say "synchronous replication"?

errdetail("The transaction has already been committed locally but
might have not been replicated to the standby.")));
errdetail("The transaction has committed locally, but may not have
replicated to the standby.")));

Could we have these saying precisely the same thing?

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Robert Haas 2011-03-17 17:46:37 Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
Previous Message Robert Haas 2011-03-17 17:12:31 pgsql: Fix various possible problems with synchronous replication.

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-03-17 17:46:37 Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
Previous Message Robert Haas 2011-03-17 17:12:31 pgsql: Fix various possible problems with synchronous replication.