Re: Replication

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Fujii Masao <fujii(dot)masao(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication
Date: 2006-08-25 16:31:55
Message-ID: 1156523515.1347.30.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Fri, 2006-08-25 at 11:23 +0200, Markus Schiltknecht wrote:
> Jeff Davis wrote:
> > Which doesn't work very well in the case of two groups of servers set up
> > in two physical locations. I can see two possibilities:
> > (1) You require a quorum to be effective, in which case your cluster of
> > databases is only as reliable as the location which holds more servers.
> > (2) You have another central authority that determines which databases
> > are up, and which are down. Then your cluster is only as reliable as
> > that central authority.
>
> Right, the ideal here would be two sync clusters at both locations,
> connected via async replication :-)
>
> > Even if you have a large number of nodes at different locations, then
> > you end up with strange decisions to make if the network connections are
> > intermittent or very slow. A temporary slowdown of many nodes could
> > cause them to be degraded until some kind of human intervention brought
> > them back. Until that time you might not be able to determine which
> > nodes make up an authoritative group.
>
> Side note: in such a case, I think a GCS will just choose only one node
> to be the 'authoritative group', because most systems cannot afford to
> have long waits for such decisions. For database replication I also
> think it's better to have at least one node running than none.
>
> > This kind of degradation could
> > happen in the case of a DDoS attack, or perhaps a worm moving around the
> > internet.
>
> Well, sync replication in general needs a good, low latency and secure
> interconnect. The internet does not seem to be a good fit here.
>
> > In practice everyone can find a solution that works for them. However,
> > synchronous replication is not perfect, and there are many failure
> > scenarios which need to be resolved in a way that fits your business. I
> > think synchronous replication is inherently less available than
> > asynchronous.
>
> This surely depends on the environment. With a dedicated (i.e. low
> latency and secure) interconnect sync replication is surely more
> available because your arguments above don't apply. And because sync
> replication guarantees you won't lose committed transactions.
>
> If however you want or have to replicate over the internet it depends.
> Your arguments above also apply to async replication. Only that because
> of the conflict resolution, async replication systems can continue to
> operate on all the disconnected nodes and merge their work later on as
> the network is up again. But then again, async still has the danger of
> losing transactions.
>
> So I probably agree: if you are on an unreliable network and if you have
> conflict resolution set up correctly, then async replication is more
> available, but less secure.
>
> As I said above, sync replication needs a reliable interconnect, better
> even have two interconnects, because it's a SPOF for a clustered
> database system.
>

Ok, I agree with your statements. Async is convenient in many ways, but
has less durable transactions (at least for transactions committed
recently). Sync has some limitations, and is harder to get right (at
least if you want good availability as well) but provides more durable
transactions and consistency between systems.
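
To make point (1) from the quoted text concrete, here is a minimal sketch of
a strict-majority quorum check. The site layout (three servers at site A, two
at site B) and the function names are hypothetical, chosen only for
illustration; they do not correspond to any actual PostgreSQL or GCS API.

```python
def has_quorum(reachable: int, total: int) -> bool:
    """Strict majority rule: more than half of all nodes must be reachable."""
    return 2 * reachable > total


def survives_site_loss(sites: dict[str, int], failed_site: str) -> bool:
    """Return True if the remaining sites still hold a majority of all nodes."""
    total = sum(sites.values())
    remaining = total - sites[failed_site]
    return has_quorum(remaining, total)


# Hypothetical layout: 3 servers at site A, 2 servers at site B.
sites = {"A": 3, "B": 2}

print(survives_site_loss(sites, "B"))  # True: site A alone is 3 of 5
print(survives_site_loss(sites, "A"))  # False: site B alone is 2 of 5
```

Under this rule the cluster stays writable only while the larger site is up,
which is the sense in which it is "only as reliable as the location which
holds more servers". An even split (say 2 and 2) would leave neither side
with a majority.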

Regards,
Jeff Davis
