replication docs: split single vs. multi-master

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: pgsql-patches(at)postgresql(dot)org
Subject: replication docs: split single vs. multi-master
Date: 2006-11-15 10:43:27
Message-ID: 455AEF4F.3010304@bluegap.ch
Lists: pgsql-hackers pgsql-patches

Hi,

as promised on -docs, here comes my proposal on how to improve the
replication documentation. The patches are split as follows and have to
be applied in order:

replication_doku_1.diff:

Smallest possible one-word change, to warm up...

replication_doku_2.diff:

Moves down "Clustering For Parallel Query Execution", because
it's not a replication type, but a feature, see explanation below.

replication_doku_3.diff:

This is the most important part, splitting all replication types
into single- and multi-master replication. I'm new to SGML, so
please bear with me if this is not the right way to do it...

IMO, "Shared-Disk-Failover" does not fall into a replication category.
Should we mention there that 'sharing' a disk using NFS or some
such is not recommended? (And, more importantly, that it does not work
as a multi-master replication solution.)

I've added a general paragraph describing Single-Master Replication.
I'm stating that 'Single-Master Replication is always asynchronous'.
Can anybody think of a counterexample? Or a use case for sync
Single-Master Replication? The argument to put down is: if you go
sync, why don't you do Multi-Master right away?

Most of the "Clustering for Load Balancing" text applies to all
synchronous, Multi-Master Replication algorithms, even to
"Query Broadcasting". Thus it became the general description
of Multi-Master Replication. The section "Clustering for
Load Balancing" has been removed.

replication_doku_4.diff:

These are the text modifications I made to adjust to the new structure.
I've adjusted the Multi-Master Replication text so that it really is
appropriate for all existing solutions.

"Query Broadcasting" has some corrections, mainly so that the section
sticks to describing that algorithm and leaves out the general
properties of Multi-Master Replication.

I've added two sections to describe the 2PC and Distributed SHMEM
algorithms, which belong in that category and cover all of the
previous text, except that I've removed the mention of Oracle RAC
in favor of Pgpool-II.

IMO this makes it clearer what replication types exist and how to
categorize them. I'm tempted to mention the Postgres-R algorithm as a
fourth sub-section of Multi-Master Replication, as it's quite different
from all the others in many respects. But I urgently need to go to
work now... besides, I'm heavily biased regarding Postgres-R, so
probably someone else should write that paragraph. :-)

The only downside of the structure I'm proposing here is that the
topics which are not replication algorithms get somewhat sidelined,
namely "Shared-Disk Failover", "Data Partitioning", "Parallel Query
Execution" and "Commercial Solutions".

For me, "Data Partitioning" as well as "Parallel Query Execution" are
possible optimizations which can be run on top of replicated data. They
don't replicate data and are thus not replication solutions. But
grouping those two together would make sense.

So. I really have to go to work now!

Regards

Markus

Attachment               Content-Type  Size
replication_doku_1.diff  text/x-patch  844 bytes
replication_doku_2.diff  text/x-patch  3.3 KB
replication_doku_3.diff  text/x-patch  4.9 KB
replication_doku_4.diff  text/x-patch  5.3 KB
