Re: Sync Rep: First Thoughts on Code

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, robertmhaas(at)gmail(dot)com, pgsql(at)j-davis(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, markus(at)bluegap(dot)ch, masao(dot)fujii(at)gmail(dot)com, aidan(at)highrise(dot)ca, heikki(dot)linnakangas(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sync Rep: First Thoughts on Code
Date: 2008-12-14 17:57:24
Message-ID: 49454904.9050206@mark.mielke.cc
Lists: pgsql-hackers

Simon Riggs wrote:
> I am truly lost to understand why the *name* "synchronous replication"
> causes so much discussion, yet nobody has discussed what they would
> actually like the software to *do* (this being a software discussion
> list...). AFAICS we can make the software behave like *any* of the
> definitions discussed so far.
>

I think people have talked about 'like' in the context of user
expectations. That is, there seems to exist a set of people (probably
those who've never worked with a multi-replica solution before) who
expect that once commit completes on one server, they can query any
other master or slave and be guaranteed visibility of the transaction
they just committed. These people might, in theory, decide not to use
Postgres-R, or at least change their approach to how they work with
Postgres-R, if the name were in some way more intuitive to them in
terms of what is actually being provided.

"Synchronous replication" by itself describes only replication; it
says nothing about visibility, so to some degree people are focusing
on the wrong term as the problem. Even if it were called "asynchronous
replication" - not sure that I care either way - this wouldn't improve
the understanding for the casual user of what is happening behind the
scenes. Neither synchronous nor asynchronous guarantees that the change
will be immediately visible from other nodes after I type 'commit;'.
Asynchronous might err on the side of not immediately visible, where
synchronous might (incorrectly) imply immediate visibility, but it's not
an accurate guarantee to provide.

Synchronous does not guarantee visibility immediately after commit. Some
indefinite but usually short time must normally pass from when my
'commit;' completes until the shared memory visible to my process
"sees" the transaction. Multiple replicas with network latency or
reliability issues widen this window from something that is normally
not encountered to something that would normally be encountered.

The only way to guarantee visibility is to ensure that the new
transaction is guaranteed to be visible from a shared memory perspective
on every machine in the pool, and every active backend process. If my
'commit;' is going to wait for this to occur, I think it forces every
commit to make numerous network round trips to each machine in the
pool, it forces each machine in the pool to be network accessible and
responsive, it forces all commits to be serialized in the sense that
"the slowest machine in the pool determines the time for my commit to
complete", and I think it implies some sort of inter-process signalling,
or at the very least CPU level signalling about shared memory (in the
case of multiple CPUs).
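
To make the "slowest machine" point concrete, here is a rough sketch of
the arithmetic only. It is not taken from any replication patch; the
function name, its parameters, and the simple "local time plus N round
trips to the slowest replica" cost model are all assumptions made for
illustration.

```python
def commit_latency_ms(local_ms, replica_round_trip_ms, round_trips=1,
                      wait_for_all=True):
    """Estimate how long 'commit;' blocks under a toy cost model.

    local_ms: time to make the commit durable on the local node.
    replica_round_trip_ms: one network round-trip time per replica.
    round_trips: how many round trips the protocol needs per replica
        (hypothetical; a two-phase scheme would need at least 2).
    wait_for_all: if False, the commit returns after the local work only.
    """
    if not wait_for_all or not replica_round_trip_ms:
        return local_ms
    # Replicas are contacted in parallel, so the commit waits for the
    # slowest one, not for the sum of all of them.
    return local_ms + round_trips * max(replica_round_trip_ms)


# Three replicas, one of them across a slow link.
fast_pool = [0.5, 0.6, 0.7]
mixed_pool = [0.5, 2.0, 40.0]
print(commit_latency_ms(1.0, fast_pool, round_trips=2))
print(commit_latency_ms(1.0, mixed_pool, round_trips=2))
```

In this model a single slow or unresponsive replica dominates every
commit, no matter how fast the other machines in the pool are.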

People such as myself think that a visibility guarantee is unreasonable
and certain to cause scalability or reliability problems. So, my 'like'
is an efficient multi-master solution where if I put 10 machines in the
pool, I expect my normal query/commit loads to approach 10X as fast. My
like prefers scalability over guarantees that may be difficult to
provide, and probably are not provided today even in a single server
scenario.

> It is certainly far too early to say what the final exact behaviour will
> be and there is no reason at all to pre-suppose that it need only be a
> single behaviour. I'm in favour of options, generally, but I would say
> that the distinction between some of these options is mostly very fine
> and strongly doubt whether people would use them if they existed. *But*
> I think we can add them at a later stage of development if requirements
> genuinely exist once all the benefits *and* costs are understood.
>

The above 'commit;' behaviour difference - whether it completes when the
commit is permanent (it will definitely be applied to all replicas - it
just may take time to do so), or when the commit has actually taken
effect (two-phase commit on all replicas - and both phases have
completed on all replicas - what happens if second phase commit fails
on one or more servers?), or when the commit is guaranteed to be
visible from all existing and new sessions (two-phase commit plus
additional signalling required?) - might be such an option.

I'm doubtful, though - as the difference in implementation between the
first and second is pretty significant.
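
The three possible completion points for 'commit;' described above can
be written down as a small sketch. This is purely illustrative: the
names AckLevel, ReplicaState and commit_may_return are hypothetical and
do not correspond to any existing or proposed PostgreSQL setting.

```python
from dataclasses import dataclass
from enum import Enum


class AckLevel(Enum):
    DURABLE = 1  # change is permanent; it will eventually be applied everywhere
    APPLIED = 2  # the commit has actually taken effect on every replica
    VISIBLE = 3  # every existing and new session can see the change


@dataclass
class ReplicaState:
    received: bool  # replica has safely received the change
    applied: bool   # replica has finished applying the commit
    visible: bool   # all sessions on the replica can see the commit


def commit_may_return(level, replicas):
    """Return True if 'commit;' may complete at the requested level."""
    if level is AckLevel.DURABLE:
        return all(r.received for r in replicas)
    if level is AckLevel.APPLIED:
        return all(r.received and r.applied for r in replicas)
    # VISIBLE: the strictest level, requiring everything above as well.
    return all(r.received and r.applied and r.visible for r in replicas)


pool = [
    ReplicaState(received=True, applied=True, visible=False),
    ReplicaState(received=True, applied=False, visible=False),
]
for level in AckLevel:
    print(level.name, commit_may_return(level, pool))
```

Each level is strictly harder to satisfy than the one before it, which
is why the same pool can be "committed" under one definition and still
waiting under another.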

I'm curious about your suggestion to direct queries that need the latest
snapshot to the 'primary'. I might have misunderstood it - but it seems
that the expectation from some is that *all* sessions see the latest
snapshot, so would this not imply that all sessions would be redirected to
the 'primary'? I don't think it is reasonable myself, but I might be
misunderstanding something...

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>
