Re: hot standby - merged up to CVS HEAD

From: David Fetter <david(at)fetter(dot)org>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: hot standby - merged up to CVS HEAD
Date: 2009-08-27 19:15:50
Message-ID: 20090827191550.GA3886@fetter.org
Lists: pgsql-hackers

On Thu, Aug 27, 2009 at 07:08:28PM +0100, Simon Riggs wrote:
>
> On Mon, 2009-08-17 at 11:19 +0300, Heikki Linnakangas wrote:
>
> > I think there's a race condition in the way LogCurrentRunningXacts() is
> > called at the end of checkpoint. This can happen in the master:
> >
> > 1. Checkpoint starts
> > 2. Transaction 123 begins, and does some updates
> > 3. Checkpoint ends. LogCurrentRunningXacts() is called.
> > 4. LogCurrentRunningXacts() gets the list of currently running
> > transactions by calling GetCurrentTransactionData().
> > 5. Transaction 123 ends, writing commit record to WAL
> > 6. LogCurrentRunningXacts() writes the list of running XIDs to WAL. This
> > includes XID 123, since that was still running at step 4.
> >
> > When that is replayed, ProcArrayUpdateTransactions() will zap the
> > unobserved xids array with the list that includes XID 123, even though
> > we already saw a commit record for it.
>
> That's not a race condition, but it does make the code more complex. The
> issue has been long understood.
>
> I don't think it's acceptable to take and hold both ProcArray and
> WALInsertLock. Those are now the two most heavily contended locks on the
> system. We have evidence that there are burst delays associated with
> various operations on just one of those locks, let alone two.
>
> If you're still doubtful, consider the problem I've been working on
> recently: my earlier patch overlooked the initial state of the lock
> table. GetRunningTransactionData() also needs to capture the initial
> lock data.
>
> There is no way in hell that I could personally condone holding
> ProcArrayLock, WALInsertLock and all of the LockMgrLock partitions at
> the same time. So we just have to eat the complexity. (No doubt someone will
> disagree with my strong language here, but please take it as an
> indication of exactly how bad an idea holding multiple locks will be).
>
> Slight timing issues are not too bad really. We just have to assume
> that the data may be mismatched and have code to handle that.
>
> Anyway, I've been working on this problem for some time and continue to
> do so.
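
For anyone following along, here is a rough sketch of the ordering Heikki describes in steps 4 to 6, and of the "assume a mismatch and handle it" approach Simon mentions. It is a toy simulation, not code from the patch: the record kinds, the `replay` function, the `finished` set and the `tolerate_mismatch` flag are all made up for illustration, and the real patch may well handle the stale snapshot differently.

```python
# Toy model of standby replay; all names here are hypothetical,
# not taken from the PostgreSQL source or the hot standby patch.

def replay(wal_records, tolerate_mismatch):
    """Replay WAL records in order and return the set of XIDs the
    standby still believes are running afterwards."""
    running = set()    # stand-in for the unobserved xids array
    finished = set()   # XIDs whose commit record was already replayed
    for kind, payload in wal_records:
        if kind == "commit":
            finished.add(payload)
            running.discard(payload)
        elif kind == "running_xacts":
            snapshot = set(payload)
            if tolerate_mismatch:
                # Assume the snapshot may be stale: ignore any XID
                # that we have already seen finish.
                snapshot -= finished
            # Overwrite the running set with the logged list.
            running = snapshot
    return running

# WAL order from steps 5 and 6: the commit of XID 123 is written before
# the running-xacts record, whose list was collected back at step 4.
wal = [("commit", 123), ("running_xacts", [123])]

naive = replay(wal, tolerate_mismatch=False)
careful = replay(wal, tolerate_mismatch=True)
print(naive)    # {123}
print(careful)  # set()
```

With blind overwriting, XID 123 reappears as running even though its commit was already replayed. Filtering the logged list against what the standby has already seen avoids that without the master holding ProcArrayLock and WALInsertLock together.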

Great! Where's the git repository?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
