Re: Replication

From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Replication
Date: 2006-08-24 14:49:00
Message-ID: 60wt8y195v.fsf@dba2.int.libertyrms.com
Lists: pgsql-hackers

pgsql(at)j-davis(dot)com (Jeff Davis) writes:
> On Wed, 2006-08-23 at 13:36 +0200, Markus Schiltknecht wrote:
>> Hannu Krosing wrote:
>> > But if you have very few writes, then there seems no reason to do sync
>> > anyway.
>>
>> I think there is one: high-availability. A standby-server which can
>> continue if your primary fails. Of course sync is only needed if you
>> absolutely cannot afford losing any committed transaction.
>>
>
> I disagree about high-availability. In fact, I would say that sync
> replication is trading availability and performance for synchronization
> (which is a valid tradeoff, but costly).
>
> If you have an async system, all nodes must go down for the system to go
> down.
>
> If you have a sync system, if any node goes down the system goes down.
> If you plan on doing failover, consider this: what if it's not obvious
> which system is still up? What if the network route between the two
> systems goes down (or just becomes too slow to replicate over), but
> clients can still connect to both servers? Then you have two systems
> that both think that the other system went down, and both start
> accepting transactions. Now you no longer have replication at all.

That is why, for multimaster, there's a need for both an automatic
policy and some human intervention.

- You need an automatic determination of "quorum", where, to be safe,
it is only permissible for a set of $m$ servers to believe themselves
to be active if they number more than 1/2 of the total number of
expected servers.

Thus, if there are 13 servers in the cluster, then "quorum" is 7
servers.

If a set of 6 servers get cut off from the rest of the network, they
don't number at least 7, and thus know that they can't represent a
quorum.

- And if conditions change, a human may need to change the quorum
number.

If 4 new nodes get added to the 13 (17 total), quorum moves up to 9.

If 5 nodes get dropped from the 13 instead (8 total), quorum moves
down to 5.

Deciding when to throw a node out of the quorum because it is
responding too slowly is still not completely trivial, but having a
quorum policy does address your issue.
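
To make the arithmetic concrete, here is a minimal Python sketch of
the "more than half" rule. The function names quorum_size and
has_quorum are made up for illustration, and the sketch assumes the
expected cluster size is a number an administrator maintains; it says
nothing about how a node works out which peers it can actually reach.

```python
def quorum_size(total_servers: int) -> int:
    """Smallest server count that is strictly more than half the total."""
    if total_servers < 1:
        raise ValueError("cluster must have at least one server")
    # Integer division rounds down, so adding 1 always lands just
    # above the halfway mark, for both odd and even cluster sizes.
    return total_servers // 2 + 1


def has_quorum(reachable_servers: int, total_servers: int) -> bool:
    """True if a partition of this size may consider itself active."""
    return reachable_servers >= quorum_size(total_servers)


if __name__ == "__main__":
    print(quorum_size(13))    # 13 expected servers
    print(has_quorum(6, 13))  # a partition of 6 out of 13
    print(quorum_size(17))    # after adding 4 nodes to the 13
    print(quorum_size(8))     # after dropping 5 nodes from the 13
```

Since two disjoint partitions cannot both hold more than half of the
expected servers, at most one side of a network split passes the
check, which is the property that avoids the "both sides accept
transactions" case.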
--
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/linux.html
"Be humble. A lot happened before you were born." - Life's Little
Instruction Book
