Re: race condition in sync rep

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: race condition in sync rep
Date: 2011-03-28 01:27:03
Message-ID: AANLkTikE0KVCNOB4o=ZqAX=8TQ9CjvAzriZML=oeNgpk@mail.gmail.com
Lists: pgsql-hackers

On Sun, Mar 27, 2011 at 7:46 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Are the master and standby on same system or are they separated by a network?
>
> I'm surprised that a network roundtrip takes less time than the
> backend takes to mark clog and then queue for the SyncRepLock.

When I first noticed that it was slow (really hanging, though I failed
to realize it) with fsync=off, I had both clusters on the same system. I
didn't know what was going on at that point, and wasn't specifically
looking for this bug - I was actually testing some other aspect of the
behavior and hit upon it by accident. Then while I was at PG East I
realized there was a race condition. (I think I actually realized it
while I was dreaming about PostgreSQL; if you think dreaming about
PostgreSQL is a sign that something is seriously wrong with me, you
are likely correct.) Just to convince myself that I wasn't making
things up, I then stuck a sleep(1) in right before the sync rep wait,
for testing purposes, which of course made it trivial to demonstrate
the hang. I again did that with both clusters on one system (a
different machine this time), but of course with the sleep in there
that wouldn't have mattered. Then later
I realized that the race condition and the fsync=off were probably the
same problem, so I wrote up the email that way.
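
To make the failure mode concrete, here is a toy model of a lost wakeup,
written in Python rather than taken from the PostgreSQL source. It
assumes the race is that the standby's acknowledgement can be processed
before the backend has joined the wait queue, which is one reading of
the timing described in the quoted question above. The names ToySyncRep,
run, and recheck are all hypothetical, and the recheck is only one
plausible remedy, not a description of the actual patch.

```python
from dataclasses import dataclass, field


@dataclass
class ToySyncRep:
    """Toy model of a sync rep wait queue. Not PostgreSQL code."""
    acked_lsn: int = 0                              # highest position acknowledged so far
    queue: list = field(default_factory=list)       # positions backends are waiting on
    released: list = field(default_factory=list)    # positions whose waiters were woken

    def standby_ack(self, lsn):
        # Acknowledgement side: record the ack, then wake every queued
        # waiter whose position is now covered by it.
        self.acked_lsn = max(self.acked_lsn, lsn)
        still_waiting = []
        for waiting_lsn in self.queue:
            if waiting_lsn <= self.acked_lsn:
                self.released.append(waiting_lsn)
            else:
                still_waiting.append(waiting_lsn)
        self.queue = still_waiting

    def enqueue(self, lsn, recheck):
        # Backend side: join the queue. With recheck=True (an assumed fix),
        # first look at whether the ack has already arrived.
        if recheck and lsn <= self.acked_lsn:
            self.released.append(lsn)
            return
        self.queue.append(lsn)


def run(ack_first, recheck):
    """Replay one commit at position 100 under a chosen event ordering."""
    state = ToySyncRep()
    if ack_first:
        state.standby_ack(100)        # ack wins the race
        state.enqueue(100, recheck)   # backend queues too late
    else:
        state.enqueue(100, recheck)
        state.standby_ack(100)
    return "released" if 100 in state.released else "hung"
```

In this model, run(ack_first=False, recheck=False) returns "released",
run(ack_first=True, recheck=False) returns "hung", and
run(ack_first=True, recheck=True) returns "released". A sleep before the
wait would simply make the ack-first ordering far more likely.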

If your point is that I never demonstrated with sync rep between two
different systems, I agree. I suspect it could be done, but you'd
probably have to load the master down pretty heavily while keeping the
load on the standby very light - or possibly it would work to just run
a single-threaded test for a really long time, but I don't know
because I haven't tried it. I'm actually not that interested in
quantifying the exact probability of this happening under any given
set of circumstances; it seems like enough that it's been found and
fixed. If something in the phrasing of my original email gave
offence, it wasn't intended to: in particular, the use of the word
"nasty" was intended to convey "difficult to find; tricky". I think
my fear that it would prove difficult to fix also may have affected
that word choice; I didn't anticipate it being resolved so quickly and
with such a small patch.

I am doing my best to help fix the things that I believe to be bugs in
the code without pissing anybody off. Clearly, at least in your case,
that doesn't seem to have been entirely successful, but in all honesty
it's not for lack of trying. I really, really want to get this
release out the door and get back to writing code and doing
CommitFests; but I also want it to be good (as I'm sure you do as
well) and I think we're not quite there yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
