Re: [mail] Re: Big 7.4 items - Replication

From: "Al Sutton" <al(at)alsutton(dot)com>
To: "Jonathan Stanton" <jonathan(at)cnds(dot)jhu(dot)edu>
Cc: "Darren Johnson" <darren(at)up(dot)hrcoxmail(dot)com>, "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Jan Wieck" <JanWieck(at)Yahoo(dot)com>, <shridhar_daithankar(at)persistent(dot)co(dot)in>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [mail] Re: Big 7.4 items - Replication
Date: 2002-12-15 19:42:35
Message-ID: 000701c2a472$1ed24940$0100a8c0@cloud
Lists: pgsql-hackers

Jonathan,

How do the group communication daemons on systems A and B agree that T2 is
after T1?

As I understand it, the operation is performed locally before being passed on
to the group for replication. When T2 arrives at system B, system B has no
knowledge of T1 and so can perform T2 successfully.

I am guessing that system B performs T2 locally, sends it to the group
communication daemon for ordering, and then receives it back from the group
communication order queue once its position in the order queue has been
decided, before it is written to the database.

This would indicate to me that there is a single central point which decides
that T2 is after T1.

Is this true?

Al.
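
The flow guessed at above (execute locally, hand the writeset over for
ordering, commit only once its position is known) can be sketched as follows.
This is only an illustration of the hypothesis in the question: the
`Sequencer` class, the `Node` class and their methods are made-up names, and
a single central sequencer is just one possible way to produce a total order.
Nothing in this thread says that Spread is built this way.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Sequencer:
    """Hypothetical central point that assigns each writeset a global position."""
    _next: count = field(default_factory=count)
    log: list = field(default_factory=list)

    def order(self, origin, txn):
        seq = next(self._next)
        self.log.append((seq, origin, txn))
        return seq


@dataclass
class Node:
    name: str
    pending: dict = field(default_factory=dict)    # executed locally, not yet committed
    committed: list = field(default_factory=list)  # committed, in agreed order

    def execute_locally(self, txn, sequencer):
        # Step 1: do the work locally, but do not answer the client yet.
        self.pending[txn] = "executed"
        # Step 2: hand the writeset over so its position can be decided.
        return sequencer.order(self.name, txn)

    def deliver(self, log):
        # Step 3: commit in the agreed order, regardless of where a txn started.
        for seq, origin, txn in sorted(log):
            self.committed.append(txn)
            self.pending.pop(txn, None)


sequencer = Sequencer()
a, b = Node("A"), Node("B")
a.execute_locally("T1", sequencer)   # T1 submitted at system A
b.execute_locally("T2", sequencer)   # T2 submitted at system B
a.deliver(sequencer.log)
b.deliver(sequencer.log)
print(a.committed, b.committed)
```

In this sketch both nodes end up with the same commit order because the order
is decided in exactly one place, which is the property the question is asking
about.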

----- Original Message -----
From: "Jonathan Stanton" <jonathan(at)cnds(dot)jhu(dot)edu>
To: "Al Sutton" <al(at)alsutton(dot)com>
Cc: "Darren Johnson" <darren(at)up(dot)hrcoxmail(dot)com>; "Bruce Momjian"
<pgman(at)candle(dot)pha(dot)pa(dot)us>; "Jan Wieck" <JanWieck(at)Yahoo(dot)com>;
<shridhar_daithankar(at)persistent(dot)co(dot)in>; "PostgreSQL-development"
<pgsql-hackers(at)postgresql(dot)org>
Sent: Sunday, December 15, 2002 5:00 PM
Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

> The total order provided by the group communication daemons guarantees
> that every member will see the transactions/writesets in the same order.
> So both A and B will see that T1 is ordered before T2 BEFORE writing
> anything back to the client. So for both servers T1 will be completed
> successfully, and T2 will be aborted because of conflicting writesets.
>
> Jonathan
>
> On Sun, Dec 15, 2002 at 10:16:22AM -0000, Al Sutton wrote:
> > Many thanks for the explanation. Could you explain to me where the order of
> > the writesets is decided for the following scenario:
> >
> > If a transaction takes 50ms to reach one database from another, for a
> > specific data element (called X), the following timeline occurs
> >
> > at 0ms, T1(X) is written to system A.
> > at 10ms, T2(X) is written to system B.
> >
> > Where T1(X) and T2(X) conflict.
> >
> > My concern is that if the Group Communication Daemon (gcd) is operating on
> > each database, a successful result for T1(X) will be returned to the client
> > talking to database A because T2(X) has not reached it, and thus no conflict
> > is known about, and a successful result is returned to the client submitting
> > T2(X) to database B because it is not aware of T1(X). This would mean that
> > the two clients believe both T1(X) and T2(X) completed successfully, yet
> > they cannot due to the conflict.
> >
> > Thanks,
> >
> > Al.
> >
> > ----- Original Message -----
> > From: "Darren Johnson" <darren(at)up(dot)hrcoxmail(dot)com>
> > To: "Al Sutton" <al(at)alsutton(dot)com>
> > Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>; "Jan Wieck"
> > <JanWieck(at)Yahoo(dot)com>; <shridhar_daithankar(at)persistent(dot)co(dot)in>;
> > "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
> > Sent: Saturday, December 14, 2002 6:48 PM
> > Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication
> >
> >
> > > >
> > > >
> > > >
> > > >b) The Group Communication blob will consist of a number of processes which
> > > >need to talk to all of the others to interrogate them for changes which may
> > > >conflict with the current write that is being handled and then issue the
> > > >transaction response. This is basically the two phase commit solution with
> > > >phases moved into the group communication process.
> > > >
> > > >I can see the possibility of using solution b and having fewer group
> > > >communication processes than databases as an attempt to simplify things, but
> > > >this would mean the loss of a number of databases if the machine running the
> > > >group communication process for the set of databases is lost.
> > > >
> > > The group communication system doesn't just run on one system. For
> > > postgres-r using spread there is actually a spread daemon that runs on
> > > each database server. It has nothing to do with detecting the conflicts.
> > > Its job is to deliver messages in a total order for writesets or simple
> > > order for commits, aborts, joins, etc.
> > >
> > > The detection of conflicts will be done at the database level, by a
> > > backend process. The basic concept is "if all databases get the
> > > writesets (changes) in the exact same order, apply them in a consistent
> > > order, avoid conflicts, then one copy serialization is achieved" (one
> > > copy of the database replicated across all databases in the replica).
> > >
> > > I hope that explains the group communication system's responsibility.
> > >
> > > Darren
> > >
>
> --
> -------------------------------------------------------
> Jonathan R. Stanton jonathan(at)cs(dot)jhu(dot)edu
> Dept. of Computer Science
> Johns Hopkins University
> -------------------------------------------------------
>
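
The scenario from the quoted messages (T1(X) submitted at A, T2(X) submitted
at B, both touching X) can be worked through with a small sketch. It assumes
that the total order has already been agreed as T1 before T2, and it uses a
deliberately simple conflict rule: a writeset aborts if a writeset that
committed after its starting point wrote the same key. The `WriteSet` and
`Replica` classes, the `base` field and the rule itself are illustrative
assumptions, not the actual Postgres-R conflict handling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    txn: str
    origin: str
    keys: frozenset
    base: int  # how many writesets the origin had committed when txn started


class Replica:
    def __init__(self, name):
        self.name = name
        self.applied = []   # committed writesets, in delivery order
        self.outcome = {}   # txn -> "commit" or "abort"

    def deliver(self, ws):
        # Writesets committed after ws.base were invisible to ws, so they
        # count as concurrent with it.
        concurrent = self.applied[ws.base:]
        if any(ws.keys & other.keys for other in concurrent):
            self.outcome[ws.txn] = "abort"
        else:
            self.applied.append(ws)
            self.outcome[ws.txn] = "commit"
        # Only now would the originating replica answer its client.
        return self.outcome[ws.txn]


t1 = WriteSet("T1", "A", frozenset({"X"}), base=0)
t2 = WriteSet("T2", "B", frozenset({"X"}), base=0)
total_order = [t1, t2]   # assumed result of the group communication layer

a, b = Replica("A"), Replica("B")
for ws in total_order:
    for replica in (a, b):
        replica.deliver(ws)

print(a.outcome)
print(b.outcome)
```

Because each replica makes its decision from the delivered sequence alone,
A and B reach the same verdict without asking each other, and the client on B
is told about the abort because the reply waits for delivery.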
