Re: PostgreSQL Documentation of High Availability and

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: José Orlando Pereira <jop(at)di(dot)uminho(dot)pt>, community(at)linux(dot)di(dot)uminho(dot)pt, PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>
Subject: Re: PostgreSQL Documentation of High Availability and
Date: 2006-11-21 18:12:52
Message-ID: 200611211812.kALICqr17485@momjian.us
Lists: pgsql-docs

Markus Schiltknecht wrote:
> Hi,
>
> José Orlando Pereira wrote:
> >> Hm, what's wrong with that? Okay, we should better not mention Oracle
> >> RAC there, but it is a product doing 'Multi-Master Replication Using
> >> Clustering', isn't it?
> >
> > AFAIK, RAC uses a shared disk, thus it does not provide replication.
>
> Oh, that's right. Hm... thus there are no such things as Multi-Master
> Replication for shared-disk or shared-memory machines, because that's
> not replication. My fault, sorry.

OK, title now is "Multi-Master Clustering".

> > And I
> > don't think RAC can be emulated at all at the application level with 2PC.
>
> No, that would not make sense. The paragraph is about Multi Master
> Replication, which I thought Oracle RAC would be in. But I agree that
> Oracle RAC should not be considered replication at all.
>
> What do you think about sharing disks by the means of network file
> systems, like OCFS2? I was under the impression that Oracle built that
> one to run RAC on top of it. That combination would run on a shared
> nothing cluster, but does that make it replication?
>
> According to you, what category does Oracle RAC (and PGCluster-II)
> belong to? Shared Disk Clusters?
>
> > Classifying replication protocols is indeed a hard problem. Besides my issues
> > with the multi-master replication using clustering category, I miss a
> > reference to multi-master asynchronous replication (and thus, to
> > reconciliation), which is a big issue in Oracle, MS SQL, etc literature.
>
> Yeah, I'm missing that, too.

I added async multi-master:

<varlistentry>
<term>Multi-Master With Conflict Resolution</term>
<listitem>

<para>
For servers that are not regularly connected, like laptops or
remote servers, keeping data consistent among servers is a
challenge. One simple solution is to allow each server to
modify the data, and have periodic communication compare
databases and ask users to resolve any conflicts.
</para>
</listitem>
</varlistentry>
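
To make the "compare databases and ask users to resolve any conflicts" step concrete, here is a minimal sketch in Python. It is not how any particular replication product works. It assumes each server keeps a snapshot of the data from the last synchronization (`base`), that both servers synchronized at the same point, and that a table can be modeled as a plain key/value dictionary. The names `Replica` and `reconcile` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One occasionally connected server (hypothetical model)."""
    name: str
    base: dict = field(default_factory=dict)  # state at the last sync
    data: dict = field(default_factory=dict)  # current local state


def reconcile(a, b):
    """Compare two replicas that share the same base snapshot.

    Returns (merged, conflicts). merged holds every key whose value can
    be decided automatically; conflicts maps each remaining key to the
    two competing values so a user can pick one. A missing key is
    treated as None, meaning "deleted or never present".
    """
    merged, conflicts = {}, {}
    for key in set(a.base) | set(a.data) | set(b.data):
        old = a.base.get(key)
        va, vb = a.data.get(key), b.data.get(key)
        if va == vb:        # both sides agree (or neither changed it)
            chosen = va
        elif va == old:     # only b changed the value
            chosen = vb
        elif vb == old:     # only a changed the value
            chosen = va
        else:               # both changed it differently: ask a user
            conflicts[key] = (va, vb)
            continue
        if chosen is not None:
            merged[key] = chosen
    return merged, conflicts


base = {"x": 1, "y": 2, "z": 3}
laptop = Replica("laptop", dict(base), {"x": 10, "y": 2, "z": 30})
office = Replica("office", dict(base), {"x": 1, "y": 20, "z": 31})
merged, conflicts = reconcile(laptop, office)
```

In this example only `z` was changed on both servers, so it is the only key left for a user to resolve; `x` and `y` merge automatically.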

> Well, the docu talks about async and sync, but IMO, it's somewhat sloppy
> in that it only covers one aspect of synchronous replication (namely
> that a failover will not lose data).
>
> The other statement, that 'servers will return consistent results with
> no propagation delay' is somewhat incorrect, as there certainly is a
> delay of propagation before the commit. And in that the individual
> databases are very well consistent, just not synchronous.

OK, updated to add "little" delay, and removed "small" from async case:

load-balanced servers will return consistent results with little
propagation delay. Asynchronous updating has a delay between the

>
> Emmanuel Cecchet listed some questions one might use to categorize or
> further specify aspects of synchronous replication in [1].
>
> The current paragraph doesn't even clearly state that it's talking about
> synchronous replication. Maybe we want to have only one paragraph for
> Multi-Master replication and cover sync as well as async there?

Does the new conflict resolution section help that?

> > Coming from a fault-tolerant distributed systems background, we'd call
> > that "replicated state machine" or "active replication". I don't think
> > however that using those names in this context would be helpful.
>
> Wikipedia has a definition of replicated state machine in [2]. I'm not
> keen to use that term.
>
> >> Thank you for your suggestions. And I'm glad you're seeing PostgreSQL
> >> that way. But I think your additions don't quite fit into the
> >> documentation because they are too promotional.
> >
> > Hey, you can't blame me for trying... ;)
>
> No, it's more that I'm sorry for not having explained better what we need.

I was originally worried no one commented on my initial version of this
chapter. I am not worried any more. ;-) Actually, I think we all
understand 60% of this topic, but a different 60%, so when we are done,
it will cover 100%.

> > Ok, I understand your motivations. I agree with the listing replication
> > solutions somewhere on the website. I'd still add the research and innovation
> > bullet,
>
> Yes, pointing to that surely won't hurt.
>
> > instead of trying to squeeze group-communication based stuff in
> > existing bullets.
>
> I see 2PC, shared memory and locking and using a GCS as implementation
> details of sync, multi-master replication. I'd even put statement-based
> replication in there, but one can reasonably argue about that. Anyway,
> if at all, those should only be quickly mentioned as possible
> implementations. But I don't think it helps to go that far. Having a
> good description of sync MM and async MM replication is certainly
> sufficient there.
>
> [1]: Emmanuel Cecchet:
> https://forge.continuent.org/pipermail/sequoia/2006-November/004070.html

Ah, good read. I didn't realize the shared disk aspect of Oracle RAC,
and have removed mention of RAC from our documentation. Oracle RAC
seems like an interesting hybrid solution. They use shared disk so they
don't have to send the data to all the nodes, but send cache
invalidation information to all nodes so they know when something has
changed. I have added the Oracle RAC details as an SGML comment in case
we ever need to mention it.
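
The shared-disk-plus-invalidation idea can be illustrated with a toy model in Python. This is only a sketch of the general pattern described above, not Oracle's actual protocol: the classes `SharedDisk` and `Node`, the in-process "cluster" list, and the synchronous invalidation call are all assumptions made for illustration.

```python
class SharedDisk:
    """The single storage shared by all nodes; holds the only copy of the data."""

    def __init__(self):
        self.blocks = {}


class Node:
    """A cluster node with a private cache in front of the shared disk."""

    def __init__(self, disk, cluster):
        self.disk = disk
        self.cache = {}
        self.cluster = cluster
        cluster.append(self)

    def read(self, key):
        # Serve from the local cache; fall back to the shared disk on a miss.
        if key not in self.cache:
            self.cache[key] = self.disk.blocks.get(key)
        return self.cache[key]

    def write(self, key, value):
        # The data is written once, to the shared disk ...
        self.disk.blocks[key] = value
        self.cache[key] = value
        # ... and peers only receive a small "this key changed" message.
        for peer in self.cluster:
            if peer is not self:
                peer.invalidate(key)

    def invalidate(self, key):
        # Drop the stale entry; the next read refetches it from disk.
        self.cache.pop(key, None)


disk, cluster = SharedDisk(), []
n1, n2 = Node(disk, cluster), Node(disk, cluster)
n1.write("k", 1)
first = n2.read("k")   # n2 caches the value read from the shared disk
n1.write("k", 2)       # n2's cached copy is invalidated
second = n2.read("k")  # n2 refetches the new value
```

The point of the sketch is that no row data travels between nodes: only the invalidation does, which is why this arrangement is clustering rather than replication.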

As for going into the other details of what features each replication
solution has (e.g., adding nodes), that is beyond the scope of this
chapter, though perhaps some of the items are appropriate.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
