Re: Straightforward Synchronous Replication

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Straightforward Synchronous Replication
Date: 2010-05-27 15:50:18
Message-ID: 1274975418.4405.84.camel@ebony
Lists: pgsql-hackers

On Thu, 2010-05-27 at 10:11 -0400, Robert Haas wrote:
> On Thu, May 27, 2010 at 9:08 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > * New process: WALAck (on standby)
> > Reads shared memory to get last received and last applied xlog location
> > and sends message to WALSync on primary. Loop/Sleep forever.
>
> So would WALAck be polling shared memory? That would increase latency
> significantly, I think, though perhaps you have a plan for avoiding
> that?

The backends are going to be released in batches anyway, so I can't see
how polling makes a difference.

Polling means no waiting, hence asynchronous action and higher
throughput; with a sufficiently high polling rate there is no
significant loss of latency.
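
To illustrate the batching point, here is a minimal Python sketch. It is
not code from any patch: StandbyState, poll_once and release_waiters are
hypothetical names, and xlog locations are simplified to plain integers.
It only shows that one polled acknowledgement can release every backend
waiting at or below the reported location.

```python
from dataclasses import dataclass


@dataclass
class StandbyState:
    """Hypothetical stand-in for the standby's shared memory."""
    received_lsn: int = 0
    applied_lsn: int = 0


def poll_once(state, last_sent):
    """One WALAck iteration: return the (received, applied) pair to send
    to the primary, or None if nothing changed since the last message."""
    current = (state.received_lsn, state.applied_lsn)
    return None if current == last_sent else current


def release_waiters(waiting, acked_lsn):
    """On the primary: split waiting commit locations into those covered
    by the acknowledgement and those that must keep waiting."""
    released = [lsn for lsn in waiting if lsn <= acked_lsn]
    remaining = [lsn for lsn in waiting if lsn > acked_lsn]
    return released, remaining


# Four backends wait on the primary; the standby has received up to 300.
state = StandbyState(received_lsn=300, applied_lsn=250)
message = poll_once(state, last_sent=(0, 0))
released, remaining = release_waiters([100, 200, 300, 400], message[0])
```

In this sketch a single poll releases three of the four waiters, so the
cost of a poll interval is shared across the whole batch.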

The other plan requires WALReceiver to wait for fsync and apply, which
seems very likely to suck badly from a latency perspective. While it's
waiting it is also reducing throughput of incoming WAL. It's hard to see
how that would work well.

You could also do this by avoiding the wait in WALReceiver, but then
that becomes more like polling anyway.

> > The above needs just two parameters at user level
> > synch_rep = none | recv | apply
> > synch_rep_timeout = Ns
> > and an additional parameter in recovery.conf to say whether a standby is
> > providing the facility for sync replication (as requested by Yeb etc)
> > (default = yes).
> >
> > So this is the same as having quorum = 0 or 1 (boring but simple) and
> > having sync_rep_timeout_action = commit in all cases (clear behaviour in
> > failure modes, without need for per-standby parameters).
>
> This seems good, but I think we need a little more definition about
> what happens when sync_rep_timeout expires.

It commits... that is very clear: "sync_rep_timeout_action = commit in
all cases". Commit is the only viable option, since abort and
wait-forever both have disadvantages pointed out for them.
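
As a rough sketch of that rule, assuming the three settings none, recv
and apply proposed above: the function name wait_for_standby and the use
of threading.Event are illustrative assumptions, not part of any
existing patch. The point is that every path ends in a commit; the
return value only records why the wait ended.

```python
import threading


def wait_for_standby(mode, recv_ack, apply_ack, timeout_s):
    """Block a committing backend according to the sync rep mode.

    Returns 'local' (no wait requested), 'acked' (standby confirmed in
    time) or 'timeout'. The caller commits in all three cases, which is
    the 'timeout action = commit' behaviour described in the text.
    """
    if mode == "none":
        return "local"
    if mode not in ("recv", "apply"):
        raise ValueError(f"unknown sync rep mode: {mode!r}")
    # recv waits for the WAL to be received, apply for it to be applied.
    event = recv_ack if mode == "recv" else apply_ack
    # Event.wait returns True if the event was set, False on timeout.
    return "acked" if event.wait(timeout_s) else "timeout"


recv_ack = threading.Event()
apply_ack = threading.Event()
recv_ack.set()  # standby has received the WAL but not yet applied it

outcome_recv = wait_for_standby("recv", recv_ack, apply_ack, 0.05)
outcome_apply = wait_for_standby("apply", recv_ack, apply_ack, 0.05)
```

Here the recv wait ends with an acknowledgement and the apply wait ends
by timeout, and the transaction commits either way.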

> > Yes, this is a 3rd design for sync rep, though I think it improves upon
> > the things I've heard so far from other authors and also includes
> > feedback from Dimitri, Heikki, Yeb, Alastair. I'm happy to code this as
> > well, when 9.1 dev starts and a benchmark should be interesting also.
>
> It's great that we have so many people who want to implement this
> feature, or in one case already have. I'm not sure whose design is
> best, but I do hope that we can avoid dueling patches. There are
> plenty of other good features to work on also.

There is already a patch on SR, yet Masao is discussing another that, as
far as I can see, contains almost nothing of Zoltan's work, not even
similar ideas. The dueling patches situation looks like it already
exists to me, though not of my making or encouragement. Even if I agreed
with everything one of those authors says, there would still be two
patches.

Considering a variety of design approaches seems like a good idea for an
important feature, especially when the information is thin and opinions
run high. It's unlikely that anyone is right about everything, which is
why I've amalgamated this simple proposal from everything said so far.

It's easy to add some things if we add them at the start, much harder to
retrofit them. I've shown that some things are easier than has been
said, with fewer parameters and a good case for better performance also.

--
Simon Riggs www.2ndQuadrant.com
