Re: Review of Synchronous Replication patches

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: zb(at)cybertec(dot)at
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, hs(at)cybertec(dot)at
Subject: Re: Review of Synchronous Replication patches
Date: 2010-07-24 12:17:51
Message-ID: 4C4AD9EF.7070204@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Zoltán,

Thanks for your reply!
> Instead, I will post a patch that unifies my configuration choices with
> Fujii's patch.
Please know that Fujii's patch is also a work in progress. I didn't
mention in my review the previously discussed items, most important the
changing the polling loops (see e.g.
http://archives.postgresql.org/pgsql-hackers/2010-07/msg00757.php).
> Do you have suggestions for better worded GUCs?
> "slave" seems to be phased out by "standby" for political correctness, so
> "synchronous_standby" instead of "synchronous_slave". You mentioned
> "min_sync_replication_clients" -> "quorum_min_sync_standbys". What else?
>
The 'quorum_min_sync_standbys' came from a reply to the design of
Fujii's patch on
http://archives.postgresql.org/pgsql-hackers/2010-07/msg01167.php.
However after thinking about it a bit more, I still fail to see the use
case for a maximum quorum number. Having a 'max quorum' also seems to
contradict common meanings of 'quorum', which in itself is a minimum,
minimum number, least required number, lower limit.

Having also sync in the name is useful, imho, so people know it's not
about async servers. So then the name would become quorum_sync_standbys
or synchronous_standby_quorum.

Though I think I wouldn't use 'strict_sync_replication' (explained in
http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php) I can
imagine others want this feature, to not have the master halt by sync
standby failure. If that's what somebody never want, than the way this
is solved by this parameter is elegant: only wait if they are connected.

In recovery.conf, a boolean to discern between sync and async servers
(like in your patch) is IMHO better than mixing 'sync or async' with the
replication modes. Together with the replication modes, this could then
become
synchronous_standby (boolean)
synchronous_mode (recv,fsync,replay)

> You also noticed that my patch addressed 2PC, maybe I will have to add
> this part to Fujii's patch, too. Note: I haven't yet read his patch,
> maybe working with LSNs instead of XIDs make this work automatically,
> I don't know.
Yes, and I think we definately need automated replicated cluster tests.
A large part of reviewing went into virtual machine setup and cluster
setup. I'm not sure if a full test suite that includes node failures
could or should be included in the core regression test, but anything
that automates setup, a few transactions and failures would benefit
everyone working and testing on replication.

regards,
Yeb Havinga

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2010-07-24 12:23:48 Re: Functional dependencies and GROUP BY
Previous Message Andrew Geery 2010-07-24 10:47:32 Re: Review: Patch for phypot - Pygmy Hippotause