Re: Quorum commit for multiple synchronous replication.

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Quorum commit for multiple synchronous replication.
Date: 2016-12-06 04:11:20
Message-ID: CAHGQGwFqHGKn603-ufy+25hHRc6q__fjk9NRyMbUxnDEh9=N=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 28, 2016 at 8:03 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Sat, Nov 26, 2016 at 10:27 PM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Tue, Nov 15, 2016 at 7:08 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>> Attached latest version patch incorporated review comments. After more
>>> thought, I agree and changed the value of standby priority in quorum
>>> method so that it's not set 1 forcibly. The all standby priorities are
>>> 1 If s_s_names = 'ANY(*)'.
>>> Please review this patch.
>>
>> Sorry for my late reply. Here is my final lookup.
>
> Thank you for reviewing!
>
>> <synopsis>
>> -<replaceable class="parameter">num_sync</replaceable> ( <replaceable
>> class="parameter">standby_name</replaceable> [, ...] )
>> +[ANY] <replaceable class="parameter">num_sync</replaceable> (
>> <replaceable class="parameter">standby_name</replaceable> [, ...] )
>> +FIRST <replaceable class="parameter">num_sync</replaceable> (
>> <replaceable class="parameter">standby_name</replaceable> [, ...] )
>> <replaceable class="parameter">standby_name</replaceable> [, ...
>> This can just be replaced with [ ANY | FIRST ]. There is no need for
>> braces as the keyword is not mandatory.
>>
>> + is the name of a standby server.
>> + <literal>FIRST</> and <literal>ANY</> specify the method used by
>> + the master to control the standby servres.
>> </para>
>> s/servres/servers/.
>>
>> if (priority == 0)
>> values[7] = CStringGetTextDatum("async");
>> else if (list_member_int(sync_standbys, i))
>> - values[7] = CStringGetTextDatum("sync");
>> + values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY ?
>> + CStringGetTextDatum("sync") : CStringGetTextDatum("quorum");
>> else
>> values[7] = CStringGetTextDatum("potential");
>> This can be simplified a bit as "quorum" is the state value for all
>> standbys with a non-zero priority when the method is set to
>> SYNC_REP_QUORUM:
>> if (priority == 0)
>> values[7] = CStringGetTextDatum("async");
>> + else if (SyncRepConfig->sync_method == SYNC_REP_QUORUM)
>> + values[7] = CStringGetTextDatum("quorum");
>> else if (list_member_int(sync_standbys, i))
>> values[7] = CStringGetTextDatum("sync");
>> else
>>
>> SyncRepConfig data is made external to syncrep.c with this patch as
>> walsender.c needs to look at the sync method in place, no complain
>> about that after considering if there could be a more elegant way to
>> do things without this change.
>
> Agreed.
>
>> While reviewing the patch, I have found a couple of incorrectly shaped
>> sentences, both in the docs and some comments. Attached is a new
>> version with this word-smithing. The patch is now switched as ready
>> for committer.
>
> Thanks. I found a typo in v7 patch, so attached latest v8 patch.

+ qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn);
+ qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn);
+ qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn);

In quorum commit, we need to calculate the N-th largest LSN from
M quorum synchronous standbys' LSN. N would be usually 1 - 3 and
M would be 1 - 10, I guess. You used the algorithm using qsort for
that calculation. But I'm not sure if that's enough effective algorithm
or not.

If M (i.e., number of quorum sync standbys) is enough large,
your choice would be good. But usually M seems not so large.

Thought?

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2016-12-06 04:17:30 Re: Typmod associated with multi-row VALUES constructs
Previous Message Andres Freund 2016-12-06 03:53:28 Re: Logical Replication WIP