Re: Quorum commit for multiple synchronous replication.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Quorum commit for multiple synchronous replication.
Date: 2016-11-16 03:08:01
Message-ID: CAD21AoBGXB286o1eL0vGidg05iWUzXGebMOB3pYeD5cq9Ocerg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 14, 2016 at 5:39 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Tue, Nov 8, 2016 at 10:12 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Tue, Nov 8, 2016 at 12:25 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> On Tue, Oct 25, 2016 at 10:35 PM, Michael Paquier
>>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>>> + if (SyncRepConfig->sync_method == SYNC_REP_PRIORITY)
>>>> + return SyncRepGetSyncStandbysPriority(am_sync);
>>>> + else /* SYNC_REP_QUORUM */
>>>> + return SyncRepGetSyncStandbysQuorum(am_sync);
>>>> Both routines share the same logic to detect if a WAL sender can be
>>>> selected as a candidate for sync evaluation or not, still per the
>>>> selection they do I agree that it is better to keep them as separate.
>>>>
>>>> + /* In quroum method, all sync standby priorities are always 1 */
>>>> + if (found && SyncRepConfig->sync_method == SYNC_REP_QUORUM)
>>>> + priority = 1;
>>>> Honestly I don't understand why you are enforcing that. Priority can
>>>> be important for users willing to switch from ANY to FIRST to have a
>>>> look immediately at what are the standbys that would become sync or
>>>> potential.
>>>
>>> I thought that since all standbys appearing in s_s_names list are
>>> treated equally in quorum method, these standbys should have same
>>> priority.
>>> If these standby have different sync_priority, it looks like that
>>> master server replicates to standby server based on priority.
>>
>> No actually, because we know that they are a quorum set, and that they
>> work in the same set. The concept of priorities has no real meaning
>> for quorum as there is no ordering of the elements. Another, perhaps
>> cleaner idea may be to mark the field as NULL actually.
>
> We know that but I'm concerned it might confuse the user.
> If these priorities are the same, it can obviously imply that all of
> the standby listed in s_s_names are handled equally.
>
>>>> It is not possible to guess from how many standbys this needs to wait
>>>> for. One idea would be to mark the sync_state not as "quorum", but
>>>> "quorum-N", or just add a new column to indicate how many in the set
>>>> need to give a commit confirmation.
>>>
>>> As Simon suggested before, we could support another feature that
>>> allows the client to control the quorum number.
>>> Considering adding that feature, I thought it's better to have and
>>> control that information as a GUC parameter.
>>> Thought?
>>
>> Similarly that would be a SIGHUP parameter? Why not. Perhaps my worry
>> is not that much legitimate, users could just look at s_s_names to
>> guess how many in hte set a commit needs to wait for.
>
> It would be PGC_USRSET similar to synchronous_commit. The user can
> specify it in statement level.
>
>> + <para>
>> + <literal>FIRST</> and <literal>ANY</> are case-insensitive word
>> + and the standby name having these words are must be double-quoted.
>> + </para>
>> s/word/words/.
>>
>> + <literal>FIRST</> and <literal>ANY</> specify the method of
>> + that how master server controls the standby servers.
>> A little bit hard to understand, I would suggest:
>> FIRST and ANY specify the method used by the master to control the
>> standby servers.
>>
>> + The keyword <literal>FIRST</>, coupled with an integer
>> + number N higher-priority standbys and makes transaction commit
>> + when their WAL records are received.
>> This is unclear to me. Here is a correction:
>> The keyword FIRST, coupled with an integer N, makes transaction commit
>> wait until WAL records are received fron the N standbys with higher
>> priority number.
>>
>> + <varname>synchronous_standby_names</>. For example, a setting
>> + of <literal>ANY 3 (s1, s2, s3, s4)</> makes transaction commits
>> + wait until receiving receipts from at least any three standbys
>> + of four listed servers <literal>s1</>, <literal>s2</>, <literal>s3</>,
>> This could just mention WAL records instead of "receipts".
>>
>> Instead of saying "an integer number N", we could use <literal>num_sync</>.
>>
>> + If <literal>FIRST</> or <literal>ANY</> are not specified,
>> this parameter
>> + behaves as <literal>ANY</>. Note that this grammar is
>> incompatible with
>> + <productname>PostgresSQL</> 9.6, where no keyword specified
>> is equivalent
>> + as if <literal>FIRST</> was specified. In short, there is no
>> real need to
>> + specify <replaceable class="parameter">num_sync</replaceable> as this
>> + behavior does not have changed, as well as it is not
>> necessary to mention
>> + pre-9.6 versions are the multi-sync grammar has been added in 9.6.
>> This paragraph could be reworked, say:
>> if FIRST or ANY are not specified this parameter behaves as if ANY is
>> used. Note that this grammar is incompatible with PostgreSQL 9.6 which
>> is the first version supporting multiple standbys with synchronous
>> replication, where no such keyword FIRST or ANY can be used. Note that
>> the grammar behaves as if FIRST is used, which is incompatible with
>> the post-9.6 behavior.
>>
>> + <entry>Synchronous state of this standby server. <literal>quorum-N</>
>> + , where N is the number of synchronous standbys that transactions
>> + need to wait for replies from, when standby is considered as a
>> + candidate of quorum commit.</entry>
>> Nitpicking: I think that the comma goes to the previous line if it is
>> the first character of a line.
>>
>> + if (SyncRepConfig->sync_method == SYNC_REP_PRIORITY)
>> + return SyncRepGetSyncStandbysPriority(am_sync);
>> + else /* SYNC_REP_QUORUM */
>> + return SyncRepGetSyncStandbysQuorum(am_sync)
>> Or that?
>> if (PRIORITY)
>> return StandbysPriority();
>> else if (QUORUM)
>> return StandbysQuorum();
>> else
>> elog(ERROR, "Boom");
>>
>> + * In priority method, we need the oldest these positions among sync
>> + * standbys. In quorum method, we need the newest these positions
>> + * specified by SyncRepConfig->num_sync.
>> Last sentence is grammatically incorrect, and it would be more correct
>> to precise the Nth LSN positions to be able to select k standbys from
>> a set of n ones.
>>
>> + SpinLockAcquire(&walsnd->mutex);
>> + write_array[i] = walsnd->write;
>> + flush_array[i]= walsnd->flush;
>> + apply_array[i] = walsnd->flush;
>> + SpinLockRelease(&walsnd->mutex);
>> A nit: adding a space on the self of the second = character. And you
>> need to save the apply position of the WAL sender, not the flush
>> position in the array that is going to be ordered.
>>
>> /*
>> * More easily understood version of standby state. This is purely
>> - * informational, not different from priority.
>> + * informational. In quorum method, we add the number to indicate
>> + * how many in the set need to give a commit confirmation.
>> */
>> if (priority == 0)
>> values[7] = CStringGetTextDatum("async");
>> else if (list_member_int(sync_standbys, i))
>> - values[7] = CStringGetTextDatum("sync");
>> + values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY ?
>> + CStringGetTextDatum("sync") : CStringGetTextDatum("quorum")
>> This code block and its explanation comments tell two different
>> stories. The comment is saying that something like "quorum-N" is used
>> but the code always prints "quorum".
>>
>> It may be a good idea in the test to check that when no keywords is
>> specified the group of standbys is in quorum mode.
>
> Yeah, I will add some tests.
>
> I will post new version patch incorporated other comments.
>

Attached latest version patch incorporated review comments. After more
thought, I agree and changed the value of standby priority in quorum
method so that it's not set 1 forcibly. The all standby priorities are
1 If s_s_names = 'ANY(*)'.
Please review this patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
000_quorum_commit_v6.patch application/x-patch 27.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Kirkwood 2016-11-16 03:45:24 Re: WIP: About CMake v2
Previous Message Mark Kirkwood 2016-11-16 03:00:48 Re: WIP: About CMake v2