Re: Big 7.4 items - Replication

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Al Sutton <al(at)alsutton(dot)com>
Cc: Darren Johnson <darren(at)up(dot)hrcoxmail(dot)com>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, shridhar_daithankar(at)persistent(dot)co(dot)in, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Big 7.4 items - Replication
Date: 2002-12-14 16:59:28
Message-ID: 200212141659.gBEGxS822196@candle.pha.pa.us


This sounds like two-phase commit. While it will work, it is probably
slower than Postgres-R's method.

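To make the comparison concrete, here is a rough Python sketch of the scheme described in the quoted message below: the receiving node forwards the write transaction to its peers, treats a missing reply as "node down", and undoes the transaction everywhere it was accepted as soon as one peer reports a conflict. The names `Node`, `Reply`, and `replicate` are invented for illustration, and the in-memory "apply" stands in for real network calls and real conflict detection.

```python
from enum import Enum


class Reply(Enum):
    OK = "OK"
    NOT_OK = "NOT_OK"


class Node:
    """Hypothetical in-memory stand-in for one database in the group."""

    def __init__(self, name, up=True, conflicts=()):
        self.name = name
        self.up = up
        self.conflicts = set(conflicts)  # transactions this node would reject
        self.applied = []

    def apply(self, txn):
        if not self.up:
            return None  # no response: node is unavailable
        if txn in self.conflicts:
            return Reply.NOT_OK
        self.applied.append(txn)
        return Reply.OK

    def rollback(self, txn):
        if txn in self.applied:
            self.applied.remove(txn)


def replicate(origin, peers, txn):
    """Return (success, names_of_unavailable_peers)."""
    if origin.apply(txn) is not Reply.OK:
        return False, []
    accepted = [origin]
    unavailable = []
    for peer in peers:
        reply = peer.apply(txn)
        if reply is None:
            unavailable.append(peer.name)  # would be logged for the admin
        elif reply is Reply.OK:
            accepted.append(peer)
        else:
            # One NOT_OK undoes the transaction on every node that said OK.
            for node in accepted:
                node.rollback(txn)
            return False, unavailable
    return True, unavailable
```

Each write costs a round trip to every peer before the client gets an answer, which is where the expected slowdown comes from.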
---------------------------------------------------------------------------

Al Sutton wrote:
> For live replication could I propose that we consider the systems A, B, and C
> connected to each other independently (i.e. A has links to B and C, B has
> links to A and C, and C has links to A and B), and that replication is
> handled by the node receiving the write-based transaction.
>
> If we consider a write transaction that arrives at A (called WT(A)), system
> A will then send WT(A) to systems B and C via its direct connections.
> System A will receive back either an OK response if there are no conflicts,
> a NOT_OK response if there are conflicts, or no response if the system is
> unavailable.
>
> If system A receives a NOT_OK response from any other node it begins the
> process of rolling back the transaction from all nodes which previously
> issued an OK, and the transaction returns a failure code to the client which
> submitted WT(A). The other systems (B and C) would track recent transactions
> and there would be a specified timeout after which the transaction is
> considered safe and can no longer be rolled back.
>
> Any system not returning an OK or NOT_OK state is assumed to be down, and
> error messages are logged to state that the transaction could not be sent to
> the system due to its unavailability, and any monitoring system would
> alert the administrator that a replicant is faulty.
>
> There would also need to be code developed to ensure that a system could be
> brought into sync with the current state of other systems within the group
> in order to allow new databases to be added, and faulty databases to be
> re-entered to the group. This code could also be used for non-realtime
> replication to allow databases to be synchronised with the live master.
>
> This would give a multi-master solution whereby a write transaction to any
> one node would guarantee that all available replicants would also hold the
> data once it is completed, and would also provide the code to handle
> scenarios where non-realtime data replication is required.
>
> This system assumes that a majority of transactions will be successful (which
> should be the case for a well-designed system).
>
> Comments?
>
> Al.
>
> ----- Original Message -----
> From: "Darren Johnson" <darren(at)up(dot)hrcoxmail(dot)com>
> To: "Jan Wieck" <JanWieck(at)Yahoo(dot)com>
> Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>;
> <shridhar_daithankar(at)persistent(dot)co(dot)in>; "PostgreSQL-development"
> <pgsql-hackers(at)postgresql(dot)org>
> Sent: Saturday, December 14, 2002 1:28 AM
> Subject: [mail] Re: [HACKERS] Big 7.4 items
>
>
> > >
> > >
> > >>
> > >>Let's say we have systems A, B and C. Each one has some
> > >>changes and sends a writeset to the group communication
> > >>system (GCS). The total order dictates WS(A), WS(B), and
> > >>WS(C) and the writesets are received in that order at
> > >>each system. Now C gets WS(A) no conflict, gets WS(B) no
> > >>conflict, and receives WS(C). Now C can commit WS(C) even
> > >>before the commit messages C(A) or C(B), because there is no
> > >>conflict.
> > >>
> > >
> > >And that is IMHO not synchronous. C does not have to wait for A and B to
> > >finish the same tasks. If now at this very moment two new transactions
> > >query system A and system C (assuming A has not yet committed WS(C)
> > >while C has), they will get different data back (thanks to non-blocking
> > >reads). I think this is pretty asynchronous.
> > >
> >
> > So if we hold WS(C) until we receive commit messages for WS(A) and
> > WS(B), will that meet your synchronous expectations, or do all the
> > systems need to commit the WS in the same order and at the same
> > exact time?
> >
> > >
> > >
> > >It doesn't lead to inconsistencies, because the transaction on A cannot
> > >do something that is in conflict with the changes made by WS(C), since
> > >its WS(A)2 will come back after WS(C) arrived at A and thus WS(C)
> > >arriving at A will cause WS(A)2 to rollback (WS is used synonymously
> > >with Xact in this context).
> > >
> > Right
> >
> > >
> > >Hope this doesn't add too much confusion :-)
> > >
> > No, however I guess I need to adjust my slides to include your
> > definition of synchronous replication. ;-)
> >
> > Darren
> >
> > >
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
> >
>
>

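For contrast, the total-order approach discussed in the deeper quotes above can be sketched in Python as follows. Every replica receives the same writesets in the same order and applies the same deterministic conflict check, so each one reaches the same commit or rollback decision without waiting for votes. The conflict rule used here (abort if a writeset that the transaction had not yet seen touched the same keys) is a simplifying assumption for illustration, not necessarily the rule Postgres-R uses; `WriteSet`, `seen`, and `certify` are made-up names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    name: str
    keys: frozenset  # rows written by the transaction
    seen: int        # how many ordered writesets the txn had seen when it ran


def certify(stream):
    """Process writesets in total order; return the names that commit."""
    committed = []  # (position in the total order, keys written)
    result = []
    for pos, ws in enumerate(stream):
        # Conflict: a committed writeset the txn had not seen wrote the same key.
        conflict = any(p >= ws.seen and keys & ws.keys for p, keys in committed)
        if not conflict:
            committed.append((pos, ws.keys))
            result.append(ws.name)
    return result


stream = [
    WriteSet("WS(A)", frozenset({"x"}), 0),
    WriteSet("WS(B)", frozenset({"y"}), 0),
    WriteSet("WS(C)", frozenset({"z"}), 0),
    # Ran on A before WS(C) arrived there, and touches the same row as WS(C).
    WriteSet("WS(A)2", frozenset({"z"}), 2),
]
print(certify(stream))
```

In this run the first three writesets commit and WS(A)2 is rolled back, matching the case described in the quoted exchange.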
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
