Re: Issues with Quorum Commit

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Issues with Quorum Commit
Date: 2010-10-05 20:45:45
Message-ID: 1286311545.9356.15.camel@jdavis-ux.asterdata.local
Lists: pgsql-hackers

On Tue, 2010-10-05 at 12:11 -0700, Josh Berkus wrote:
> B. Eventual Inconsistency
> -------------------------
> If we have a quorum commit, it's possible for any individual standby to
> be indefinitely ahead of any standby which is not needed by the quorum.
> This means that:
>
> -- There are no clear criteria for when a standby which is not needed for
> quorum should be considered no longer a synch standby, and
> -- Applications cannot make assumptions that synch rep promises some
> specific window of synchronicity, eliminating a lot of the value of
> quorum commit.

Point B seems particularly dangerous.

When you lose one of the systems and the lagging server becomes required
for quorum, then all of a sudden you could be facing a huge delay to
commit the next transaction (because it needs to catch up on a lot of
WAL replay). This can happen even without a network problem at all, and
seems very likely to result in the lagging system being considered
"down" due to a timeout. Not good, because the reason it is required for
quorum is that another standby just went down.

In other words, a lagging standby combined with a timeout mechanism is
essentially useless, because it will never catch up in time to be a part
of the quorum.
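
To put rough numbers on this, here is a minimal back-of-the-envelope
sketch. It assumes the commit has to wait until the lagging standby has
worked through its whole backlog at a constant rate. The function names,
the rate, the lag and the timeout are all hypothetical and are not
PostgreSQL settings or APIs.

```python
# Back-of-the-envelope model; every name and number here is hypothetical,
# not a PostgreSQL setting or API.

def commit_delay_seconds(lag_bytes, apply_rate_bytes_per_s):
    """Time the next commit waits for a standby to work through its backlog."""
    if apply_rate_bytes_per_s <= 0:
        raise ValueError("apply rate must be positive")
    return max(lag_bytes, 0) / apply_rate_bytes_per_s


def considered_down(lag_bytes, apply_rate_bytes_per_s, timeout_s):
    """True if the standby cannot catch up before the timeout fires."""
    return commit_delay_seconds(lag_bytes, apply_rate_bytes_per_s) > timeout_s


MiB = 1024 ** 2
GiB = 1024 ** 3

# A standby that was outside the quorum and drifted 2 GiB behind,
# replaying at 20 MiB/s, with a 30 second timeout.
delay = commit_delay_seconds(2 * GiB, 20 * MiB)
down = considered_down(2 * GiB, 20 * MiB, timeout_s=30)
```

With these made-up figures the wait is roughly 102 seconds, well past the
30 second timeout, so the standby would be dropped at exactly the moment
it became necessary for quorum.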

Regards,
Jeff Davis
