Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit

From: "Ed L(dot)" <pgsql(at)bluepolka(dot)net>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-general(at)postgresql(dot)org>, Steven Singer <ssinger(at)navtechinc(dot)com>
Subject: Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit
Date: 2003-04-12 01:32:15
Message-ID: 200304111932.15528.pgsql@bluepolka.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Friday April 11 2003 6:08, Stephan Szabo wrote:
> On Fri, 11 Apr 2003, Ed L. wrote:
> > For those of you who do see the validity in batching multiple
> > transactions, my question is restated here:
> >
> > My question: Is there an ordering algorithm that would make a
> > consistent but limited batchsize replication possible? I propose one
> > below.
> >
> > Suppose you have a replication queue on the master, with dbmirror-like
> > trigger-based insertions, that looks something like this:
> >
> > create table replication_queue (
> > xid integer,
> > seqid serial primary key,
> > data...
> > );
> >
> > Here, 'xid' is the transaction ID, 'seqid' is the queue insertion
> > order, and 'data' has all the info needed to replicate the update.
> > Every row update/delete/inserte results in a row in this queue, and a
> > transaction may consist of one to many rows in the queue.
> >
> > The algorithm I'm considering right now is the following:
> >
> > select xid, max(seqid) as "max_seqid"
> > into temp replication_order
> > from replication_queue
> > group by xid
> > order by max(seqid)
> > limit N;
> >
> > Then, to get the actual queue replication order,
> >
> > select q.xid, q.seqid, q.data
> > from replication_queue q, replication_order o
> > where q.xid = o.xid
> > order by o.max_seqid, q.seqid;
> >
> > [This is a batched variation of dbmirror's original algorithm.]
> >
> > So, replication is done by transaction groupings, in ascending order
> > according to the maximum seqid in each transaction. I'm hoping someone
> > can poke holes in this algorithm if they exist.
>
> Does it matter if transactions that do not affect each other are
> committed on the slave in a different order than they were on the master?
> I don't think that's guaranteed by the above (unless the inserts into
> replication_queue were deferred to transaction end), but I'm also
> uncertain if it's a constraint you're concerned with.

I appreciate your pointing that out. It is pretty undesirable for data to
appear on the slave in an order different from the one in which it appears
on the master. I guess that's another downside to batching. I'm not sure
this approach can do any better than approximating the order since there is
no knowledge of the commit order.

Ed

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Fitzmyers 2003-04-12 02:22:10 Re: [ADMIN] PLEASE HELP ME URGENT about choosing only the ones
Previous Message Stephan Szabo 2003-04-12 00:08:49 Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit