Re: Transaction-controlled robustness for replication

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Robert Hodges <robert(dot)hodges(at)continuent(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Jens-Wolfhard Schicke <drahflow(at)gmx(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transaction-controlled robustness for replication
Date: 2008-08-13 13:11:41
Message-ID: 1218633101.5343.341.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Wed, 2008-08-13 at 11:27 +0200, Markus Wanner wrote:
> Hi,
>
> Robert Hodges wrote:
> > Part of this is semantics—I like Simon’s logical vs. physical
> > terminology because it distinguishes neatly between replication that
> > copies implementation down to OIDs etc. and replication that copies data
> > content including schema changes but not implementation.
>
> So far, these terms have mostly caused confusion for me: "logical
> replication using WAL shipping", "physical replication, but logical
> application"...
>
> As Simon didn't explain in more details, what he has in mind, we all
> have our own and quite different interpretations. These terms obviously
> haven't helped to clarify the issue until now.

Classification of Replication Techniques
---------------------------------------

We can consider that there are two stages to replication
* transfer or shipping of data to second node(s)
* apply mechanism

Apply mechanism can be either logical, where we execute SQL, or
physical, where we bypass the SQL layer and do this at lower level.

It is possible to have
1. logical shipping, logical apply
2. physical shipping, logical apply
3. physical shipping, physical apply

Mammoth and Postgres-R are both type 1 replication systems, since they
stream data to second node in a form that makes SQL reconstruction
easier. Slony is also type 1, using triggers.
So we have another classification

1 Logical data shipping
a) Trigger based
b) Additional internals based approaches

"Warm Standby" log shipping is type 3. We ship the WAL and apply it
directly. Easy and good.

Type 2 is where you ship the WAL (efficient) then use it to reconstruct
SQL (flexible) and then apply that to other nodes. It is somewhat harder
than type 1, but requires less infrastructure (IMHO). Definitely
requires less data shipping from Primary node, so very possibly more
efficient. [Tom is absolutely right to say this is "impossible".
Currently, it is, but that's why we develop].

1a can't be synchronous by definition, but 1b and other approaches can
be. Otherwise above definitions not relevant to sync/async capability.

Note that DRBD is also a type 3 system, but uses filesystem level
physical data. Since this is outside of control of Postgres, I discount
this approach because it cannot deliver transaction-controlled synch
replication.

Previously, most RDBMS vendors supported type 1a) systems. They have now
moved to type 2 and 3 systems. Both DB2 and Oracle support a type 2
*and* a type 3 replication system. The reasons they do this are valid
for us also, so I suggest that we do the same. So for me, it is not
about whether we do type 2 or type 3, I think we should do both.

Doing both may not be possible in this release: I have one potential
sponsor for type 2, and have made public call for work on query access
to type 3 systems. I recognise that some people that like type 3 do not
like type 2 and possibly vice versa.

If (not when) I do work on type 2 systems it will be only to provide
that as a "transport" option for other technologies. Not as a full-blown
replication system.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2008-08-13 13:34:55 Re: temporary statistics option at initdb time
Previous Message Dmitry Koterov 2008-08-13 11:29:48 Patch: propose to include 3 new functions into intarray and intagg