Re: Review of Synchronous Replication patches

From: zb(at)cybertec(dot)at
To: "Yeb Havinga" <yebhavinga(at)gmail(dot)com>
Cc: zb(at)cybertec(dot)at, "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org>, hs(at)cybertec(dot)at
Subject: Re: Review of Synchronous Replication patches
Date: 2010-07-24 13:06:20
Message-ID: 62086e5fc0b8e2355f9f5d055e36d76f.squirrel@internal.cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Hello Zoltán,
>
> Thanks for your reply!
>> Instead, I will post a patch that unifies my configuration choices with
>> Fujii's patch.
> Please know that Fujii's patch is also a work in progress.

Yes, I know that. But working from Fujii's last public patch
or from his GIT tree makes my patch easier to merge.

> I didn't
> mention in my review the previously discussed items, most important the
> changing the polling loops (see e.g.
> http://archives.postgresql.org/pgsql-hackers/2010-07/msg00757.php).

>> Do you have suggestions for better worded GUCs?
>> "slave" seems to be phased out by "standby" for political correctness,
>> so
>> "synchronous_standby" instead of "synchronous_slave". You mentioned
>> "min_sync_replication_clients" -> "quorum_min_sync_standbys". What else?
>>
> The 'quorum_min_sync_standbys' came from a reply to the design of
> Fujii's patch on
> http://archives.postgresql.org/pgsql-hackers/2010-07/msg01167.php.
> However after thinking about it a bit more, I still fail to see the use
> case for a maximum quorum number. Having a 'max quorum' also seems to
> contradict common meanings of 'quorum', which in itself is a minimum,
> minimum number, least required number, lower limit.
>
> Having also sync in the name is useful, imho, so people know it's not
> about async servers. So then the name would become quorum_sync_standbys
> or synchronous_standby_quorum.

Ok.

> Though I think I wouldn't use 'strict_sync_replication' (explained in
> http://archives.postgresql.org/pgsql-hackers/2010-04/msg01516.php) I can
> imagine others want this feature, to not have the master halt by sync
> standby failure. If that's what somebody never want, than the way this
> is solved by this parameter is elegant: only wait if they are connected.

Thanks, I was thinking about possible use cases and different preferences
of people.

> In recovery.conf, a boolean to discern between sync and async servers
> (like in your patch) is IMHO better than mixing 'sync or async' with the
> replication modes. Together with the replication modes, this could then
> become
> synchronous_standby (boolean)
> synchronous_mode (recv,fsync,replay)

Ok.

>> You also noticed that my patch addressed 2PC, maybe I will have to add
>> this part to Fujii's patch, too. Note: I haven't yet read his patch,
>> maybe working with LSNs instead of XIDs make this work automatically,
>> I don't know.
> Yes, and I think we definately need automated replicated cluster tests.
> A large part of reviewing went into virtual machine setup and cluster
> setup. I'm not sure if a full test suite that includes node failures
> could or should be included in the core regression test, but anything
> that automates setup, a few transactions and failures would benefit
> everyone working and testing on replication.
>
> regards,
> Yeb Havinga

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Adriano Lange 2010-07-24 13:20:01 TwoPO: experimental join order algorithm
Previous Message Peter Eisentraut 2010-07-24 12:23:48 Re: Functional dependencies and GROUP BY