Re: Transaction-controlled robustness for replication

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, Robert Hodges <robert(dot)hodges(at)continuent(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jens-Wolfhard Schicke <drahflow(at)gmx(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transaction-controlled robustness for replication
Date: 2008-08-13 05:22:46
Message-ID: 20080813052246.GE9468@alvh.no-ip.org
Lists: pgsql-hackers

Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > Tom Lane wrote:
> >> What I think Simon was actually driving at was query-shipping, which is
> >> not my idea of "WAL" at all. It has some usefulness, but also a bunch
> >> of downsides of its very own, mostly centered around reproducibility.
>
> > Actually I think the idea here is to take certain WAL records, translate
> > them into "portable" constructs, ship them, and let the slave handle the
> > remaining tasks that need to be done with it. For example you would
> > only ship heap insert, not index insert; the slave needs to take this
> > insert and derive the appropriate index operations that the slave needs.
>
> Oooh, so we'll run user-defined index functions during WAL replay.
> Yessir, *that* will be reliable and reproducible.

Hmm, I don't know what Simon was thinking, but I think it would be
acceptable to both parties to have WAL stay as it currently is (and be
used for crash recovery), and to send only the logical record (derived
from the WAL record) to the slave.

The reason for not shipping the "logical record" straight away at the
time of the operation is that the system needs to be able to resume
sending logical records if the shipping system crashes (i.e. when some
committed transactions are in WAL but have not been shipped yet).
Deriving the logical records from WAL makes that possible.
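
To make the resumption idea concrete, here is a minimal sketch in Python
(chosen for brevity). Everything in it is hypothetical: WalRecord,
LogicalChange, derive_logical, the integer LSNs and the record kinds are
illustration only, not PostgreSQL's actual WAL format or any existing API.
It assumes the shipper durably remembers the commit position of the last
transaction it shipped, and that index records are simply skipped because
the slave derives its own index operations.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass
class WalRecord:
    """Hypothetical, simplified WAL record (not the real on-disk format)."""
    lsn: int                      # position in the log, as a plain integer
    xid: int                      # transaction id
    kind: str                     # "heap_insert", "index_insert", "commit", "abort"
    table: Optional[str] = None
    row: Optional[dict] = None


@dataclass
class LogicalChange:
    """Hypothetical portable change that would be sent to the slave."""
    op: str
    table: str
    row: dict


def derive_logical(
    wal: Iterable[WalRecord], resume_after: int
) -> Iterator[Tuple[int, List[LogicalChange]]]:
    """Yield (commit_lsn, changes) for committed transactions not yet shipped.

    resume_after is the commit LSN of the last transaction known to have
    been shipped. The sketch rescans the whole log so that transactions
    that began before the resume point are still assembled completely.
    """
    pending: Dict[int, List[LogicalChange]] = {}
    for rec in wal:
        if rec.kind == "heap_insert":
            pending.setdefault(rec.xid, []).append(
                LogicalChange("insert", rec.table, rec.row)
            )
        elif rec.kind == "abort":
            pending.pop(rec.xid, None)
        elif rec.kind == "commit":
            changes = pending.pop(rec.xid, [])
            # Already-shipped transactions are skipped after a restart.
            if rec.lsn > resume_after and changes:
                yield rec.lsn, changes
        # Any other kind (e.g. "index_insert") is deliberately not shipped.
```

After a crash, the shipper would call derive_logical() with the last
commit position it had recorded, and record each new commit position only
once the slave has acknowledged that transaction.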

> In any case, you didn't answer the point about heap TIDs not matching
> across architectures. That seems at minimum to require that UPDATE
> and DELETE identify target tuples by primary key instead of TID.

Yep -- the PK would be required. Alternatively, one could accept tables
that have no PKs but for which no UPDATEs or DELETEs are allowed on the
shipping system. (This could be useful for log-type tables that you
want to replicate.)
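
A rough sketch of what applying such records by PK could look like on the
replay side, again in Python and again entirely hypothetical
(ReplicaTable, apply and the in-memory storage are made up for
illustration). In the scheme described above the UPDATE/DELETE
restriction would be enforced on the shipping system; the check here is
only a safety net on the receiving end.

```python
from typing import Optional, Sequence


class ReplicaTable:
    """Hypothetical in-memory stand-in for a table on the replay system."""

    def __init__(self, name: str, pk_columns: Optional[Sequence[str]] = None):
        self.name = name
        self.pk_columns = tuple(pk_columns) if pk_columns else None
        self.by_pk: dict = {}   # rows of tables that have a PK, keyed by PK
        self.heap: list = []    # rows of insert-only tables without a PK

    def _key(self, values: dict) -> tuple:
        return tuple(values[c] for c in self.pk_columns)

    def apply(
        self,
        op: str,
        row: Optional[dict] = None,
        old_key: Optional[dict] = None,
    ) -> None:
        """Apply one logical change; targets are found by PK, never by TID."""
        if op == "insert":
            if self.pk_columns is None:
                self.heap.append(dict(row))
            else:
                self.by_pk[self._key(row)] = dict(row)
            return
        if op not in ("update", "delete"):
            raise ValueError(f"unknown operation: {op}")
        if self.pk_columns is None:
            # Log-type tables: only INSERT can be replayed without a PK.
            raise ValueError(f"{op} on table {self.name!r} requires a primary key")
        key = self._key(old_key)
        if key not in self.by_pk:
            raise KeyError(f"no row with key {key} in table {self.name!r}")
        del self.by_pk[key]
        if op == "update":
            # Re-insert under the new row's key, so PK changes also work.
            self.by_pk[self._key(row)] = dict(row)
```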

> Which requires for starters that all your tables *have* a primary
> key, and for seconds that the replay environment be capable of
> identifying the pkey and being able to do lookup operations using it.

Possibly the requirement would be that the replay system has the same
PK as the shipping system.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
