Re: 2-phase commit

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Zeugswetter Andreas SB SD <ZeugswetterA(at)spardat(dot)at>, Andrew Sullivan <andrew(at)libertyrms(dot)info>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 2-phase commit
Date: 2003-09-27 14:47:03
Message-ID: 200309271447.h8REl3I23256@candle.pha.pa.us
Lists: pgsql-hackers

Richard Huxton wrote:
> > [itch...] But you surely cannot guarantee that the slave and the master
> > time out at exactly the same femtosecond. What happens when the comm
> > link comes back online just when one has timed out and the other not?
> > (Hint: in either order, it ain't good. Double plus ungood if, say, the
> > comm link manages to deliver the master's "commit confirm" message a
> > little bit after the master has timed out and decided to abort after all.)
> >
> > In my book, timeout-based solutions to this kind of problem are certain
> > disasters.
>
> I might be (well, am actually) a bit out of my depth here, but surely what
> happens is if you have machines A,B,C and *any* of them thinks machine C has
> a problem then it does. If C can still communicate with the others then it is
> told to reinitialise/go away/start the sirens. If C can't communicate then
> it's all a bit academic.
>
> Granted, if you have intermittent problems on a link and set your timeouts
> badly then you'll have a very brittle system, but if A thinks C has died, you
> can't just reverse that decision.

I have been thinking it might be time to start allowing external
programs to be called when certain events occur that require
administrative attention --- this would be a good case for that.
Administrators could configure shell scripts to be run when the network
connection fails or servers drop off the network, alerting them to the
problem. Throwing things into the server logs isn't _active_ enough.
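
As a rough illustration of the idea above, here is a minimal sketch of an event hook that runs an administrator-configured shell command when something needs attention. Everything named here is an assumption made for the example: the event names, the EVENT_COMMANDS table, the EVENT_NAME/EVENT_DETAIL environment variables, and the notify_admin() helper are hypothetical and do not correspond to any existing PostgreSQL setting or API.

```python
import os
import subprocess

# Hypothetical table mapping event names to administrator-supplied shell
# commands. These names are placeholders for illustration only.
EVENT_COMMANDS = {
    "connection_lost": 'echo "ALERT: lost connection to $EVENT_DETAIL"',
    "server_dropped": 'echo "ALERT: server $EVENT_DETAIL dropped off the network"',
}


def notify_admin(event, detail, commands=EVENT_COMMANDS, timeout=10):
    """Run the shell command configured for `event`.

    Returns the command's exit status, or None if no command is configured
    for the event or the command did not finish within `timeout` seconds.
    """
    command = commands.get(event)
    if command is None:
        return None  # nothing configured for this event

    # Pass the event to the script through environment variables so the
    # command string itself never has to be rewritten.
    env = dict(os.environ, EVENT_NAME=event, EVENT_DETAIL=detail)
    try:
        result = subprocess.run(
            command,
            shell=True,
            env=env,
            timeout=timeout,
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # A hung alert script should not hang the caller as well.
        return None
    return result.returncode
```

A call such as notify_admin("connection_lost", "slave-b") would run the configured command with EVENT_DETAIL set to "slave-b". Passing details through the environment rather than by string substitution is one possible design choice; it keeps the administrator's command line fixed and avoids shell-quoting problems.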

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
