Re: PostgreSQL Documentation of High Availability and Load Balancing

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: José Orlando Pereira <jop(at)di(dot)uminho(dot)pt>
Cc: community(at)gorda(dot)di(dot)uminho(dot)pt, PostgreSQL-documentation <pgsql-docs(at)postgresql(dot)org>, bruce(at)momjian(dot)us, pgsql-docs(at)postgresql(dot)org
Subject: Re: PostgreSQL Documentation of High Availability and Load Balancing
Date: 2006-11-21 11:35:45
Message-ID: 4562E491.8070600@bluegap.ch
Lists: pgsql-docs

Hi,

José Orlando Pereira wrote:
>> Hm, what's wrong with that? Okay, we should better not mention Oracle
>> RAC there, but it is a product doing 'Multi-Master Replication Using
>> Clustering', isn't it?
>
> AFAIK, RAC uses a shared disk, thus it does not provide replication.

Oh, that's right. Hm... so there is no such thing as Multi-Master
Replication for shared-disk or shared-memory machines, because that's
not replication. My fault, sorry.

> And I
> don't think RAC can be emulated at all at the application level with 2PC.

No, that would not make sense. The paragraph is about Multi-Master
Replication, which I thought Oracle RAC would fall under. But I agree
that Oracle RAC should not be considered replication at all.

What do you think about sharing disks by means of network file
systems, like OCFS2? I was under the impression that Oracle built that
one to run RAC on top of it. That combination would run on a
shared-nothing cluster, but does that make it replication?

In your view, which category do Oracle RAC (and PGCluster-II)
belong to? Shared Disk Clusters?

> Classifying replication protocols is indeed a hard problem. Besides my issues
> with the multi-master replication using clustering category, I miss a
> reference to multi-master asynchronous replication (and thus, to
> reconciliation), which is a big issue in Oracle, MS SQL, etc literature.

Yeah, I'm missing that, too.

Well, the documentation talks about async and sync, but IMO, it's
somewhat sloppy in that it only covers one aspect of synchronous
replication (namely that a failover will not lose data).

The other statement, that 'servers will return consistent results with
no propagation delay', is somewhat incorrect, as there certainly is a
propagation delay before the commit. Besides, the individual
databases can very well be consistent, just not synchronous.
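
To illustrate the distinction, here is a minimal sketch in Python. It is
not how any particular product works; the Node class and the commit_sync,
commit_async and drain helpers are hypothetical names invented for this
example. It only shows where the propagation delay sits: before the
commit is acknowledged (sync) or after it (async).

```python
class Node:
    """A toy database node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}     # state visible to readers on this node
        self.pending = []  # writes received but not yet applied (async)

    def apply(self, key, value):
        self.data[key] = value


def commit_sync(origin, peers, key, value):
    """Synchronous: every node applies the write before we acknowledge.

    Returns the number of peers the commit had to wait for, i.e. the
    propagation work that happens *before* the commit returns.
    """
    origin.apply(key, value)
    waited = 0
    for peer in peers:
        peer.apply(key, value)
        waited += 1
    return waited


def commit_async(origin, peers, key, value):
    """Asynchronous: acknowledge after the local write, propagate later.

    Returns 0 because the commit waits for no peer.
    """
    origin.apply(key, value)
    for peer in peers:
        peer.pending.append((key, value))
    return 0


def drain(node):
    """Apply queued writes; stands in for the later propagation step."""
    while node.pending:
        key, value = node.pending.pop(0)
        node.apply(key, value)
```

With two nodes a and b, commit_sync(a, [b], "x", 1) leaves both nodes
showing x immediately, at the cost of waiting for b. After
commit_async(a, [b], "y", 2), node b still lacks y until drain(b) runs;
b is internally consistent during that window, just not in sync with a.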

Emmanuel Cecchet listed some questions one might use to categorize or
further specify aspects of synchronous replication in [1].

The current paragraph doesn't even clearly state that it's talking about
synchronous replication. Maybe we want to have only one paragraph for
Multi-Master replication and cover sync as well as async there?

> Coming from a fault-tolerant distributed systems background, we'd call
> that "replicated state machine" or "active replication". I don't think
> however that using those names in this context would be helpful.

Wikipedia has a definition of replicated state machine in [2]. I'm not
keen on using that term.

>> Thank you for your suggestions. And I'm glad you're seeing PostgreSQL
>> that way. But I think your additions don't quite fit into the
>> documentation because they are too promotional.
>
> Hey, you can't blame me for trying... ;)

No, it's more that I'm sorry for not having explained better what we need.

> Ok, I understand your motivations. I agree with the listing replication
> solutions somewhere on the website. I'd still add the research and innovation
> bullet,

Yes, pointing to that surely won't hurt.

> instead of trying to squeeze group-communication based stuff in
> existing bullets.

I see 2PC, shared memory and locking, and using a GCS as implementation
details of synchronous multi-master replication. I'd even put
statement-based replication in there, but one can reasonably argue about
that. Anyway, if they are mentioned at all, it should only be briefly,
as possible implementations. But I don't think it helps to go that far.
Having a good description of sync MM and async MM replication is
certainly sufficient there.

Again, thank you very much for your input.

Regards

Markus

[1]: Emmanuel Cecchet:
https://forge.continuent.org/pipermail/sequoia/2006-November/004070.html

[2]: Wikipedia definition of replicated state machine:
http://en.wikipedia.org/wiki/State_machine_replication