Re: Synchronous Log Shipping Replication

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Markus Wanner <markus(at)bluegap(dot)ch>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Synchronous Log Shipping Replication
Date: 2008-09-10 08:25:07
Message-ID: 1221035107.3913.591.camel@ebony.2ndQuadrant
Lists: pgsql-hackers


On Wed, 2008-09-10 at 11:10 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Wed, 2008-09-10 at 13:28 +0900, Fujii Masao wrote:
> >> On Tue, Sep 9, 2008 at 8:38 PM, Heikki Linnakangas
> >> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> >>> There's one thing I haven't figured out in this discussion. Does the write
> >>> to the disk happen before or after the write to the slave? Can you guarantee
> >>> that if a transaction is committed in the master, it's also committed in the
> >>> slave, or vice versa?
> >
> > The write happens concurrently and independently on both.
> >
> > Yes, you wait for the write *and* send pointer to be "flushed" before
> > you allow a synch commit with synch replication. (Definition of flushed
> > is changeable by parameters).
>
> The thing that bothers me is the behavior when the synchronous slave
> doesn't respond. A timeout has been discussed, after which the master
> just gives up on sending, and starts acting as if there's no slave.
> How's that different from asynchronous mode where WAL is sent to the
> server concurrently when it's flushed to disk, but we don't wait for the
> send to finish? ISTM that in both cases the only guarantee we can give
> is that when a transaction is acknowledged as committed, it's committed
> in the master but not necessarily in the slave.

We should differentiate between what the WALsender does and what the
user does in response to a network timeout.

Saying "I want to wait for a synchronous commit and I am willing to wait
forever to ensure it" leads to long hangs in some cases.

I was suggesting that some users may wish to wait only up to time X for
the commit to be acknowledged. The WALSender may keep retrying long after
that point, but that doesn't mean every waiting user has to keep waiting
as well. The user would need to say whether a timeout should be reported
as an error, or simply accepted so that the transaction gets on with it.
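
To make the idea concrete, here is a rough sketch in Python rather than
backend C. Nothing in it is an existing PostgreSQL API: the names
wait_for_commit, OnTimeout and ReplicationTimeout are made up for
illustration, and two threading.Event objects stand in for "WAL flushed
locally" and "standby has acknowledged". It only shows the user-side wait
being bounded while the sender is left alone to keep retrying.

```python
import threading
from enum import Enum


class OnTimeout(Enum):
    """Hypothetical per-user choice for what a replication timeout means."""
    ERROR = "error"      # report the timeout to the user as an error
    PROCEED = "proceed"  # accept the local commit and carry on


class ReplicationTimeout(Exception):
    """Raised when the standby did not acknowledge within the user's limit."""


def wait_for_commit(local_flushed, standby_acked, timeout_s, on_timeout):
    """Wait for a commit under a user-chosen replication timeout.

    local_flushed and standby_acked are threading.Event objects that some
    other thread (standing in for the WAL writer and the WAL sender) sets.
    The function never cancels the sender, so it may keep retrying after
    this caller has given up waiting.
    """
    # The local flush is always waited for without a limit.
    local_flushed.wait()

    # The wait for the standby is bounded by the user's timeout.
    if standby_acked.wait(timeout_s):
        return "replicated"

    if on_timeout is OnTimeout.ERROR:
        raise ReplicationTimeout(
            "standby did not acknowledge within %.3f s" % timeout_s
        )
    return "local-only"
```

A session that prefers to carry on would call it with OnTimeout.PROCEED
and check for the "local-only" result; one that wants to know about the
timeout would pass OnTimeout.ERROR and handle ReplicationTimeout. How an
error raised after the local flush should be presented to the client is
left open here.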

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
