Re: Quorum commit for multiple synchronous replication.

From:	Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Quorum commit for multiple synchronous replication.
Date:	2016-12-08 07:39:45
Message-ID:	CAB7nPqTwDHDDNv6rkCPUg4p03=e4kzU0BN_5KLnnAR8FkmndMw@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Dec 8, 2016 at 9:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> You could do that, but first I would code up the simplest, cleanest
> algorithm you can think of and see if it even shows up in a 'perf'
> profile. Microbenchmarking is probably overkill here unless a problem
> is visible on macrobenchmarks.

This is what I would go for! The current code does a simple thing: it
selects the Nth element using qsort() after scanning each WAL sender's
values. And I think that Sawada-san got it right. Even on my laptop,
in a pgbench run with 10 sync standbys and a data set that fits into
memory, SyncRepGetOldestSyncRecPtr accounts for at most 0.04% of
overhead in perf top on a non-assert, non-debug build. Hash tables and
allocations get a far larger share. With the patch,
SyncRepGetSyncRecPtr is at the same level with a quorum set of 10
nodes. Let's kick the ball for now. An extra patch could make things
better later on if that's worth it.
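
For readers following along, the "collect each WAL sender's value, then
pick the Nth element with qsort()" approach can be sketched roughly as
below. This is an illustrative sketch only, not the code from the patch:
lsn_t, cmp_lsn_desc() and quorum_lsn() are hypothetical names, lsn_t
merely stands in for a 64-bit WAL position, and returning 0 for an
invalid request is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a 64-bit WAL position (not the real type). */
typedef uint64_t lsn_t;

/* qsort() comparator: orders positions from newest (largest) to oldest. */
static int
cmp_lsn_desc(const void *a, const void *b)
{
    lsn_t la = *(const lsn_t *) a;
    lsn_t lb = *(const lsn_t *) b;

    if (la > lb)
        return -1;
    if (la < lb)
        return 1;
    return 0;
}

/*
 * Return the quorum-th newest position among the n reported values,
 * i.e. the newest position that at least 'quorum' standbys have reached.
 * Returns 0 (an assumption for this sketch) when quorum is out of range
 * or memory cannot be allocated. The input array is left untouched.
 */
static lsn_t
quorum_lsn(const lsn_t *reported, size_t n, size_t quorum)
{
    lsn_t *sorted;
    lsn_t result;

    assert(reported != NULL);

    if (quorum == 0 || quorum > n)
        return 0;

    /* Work on a copy so the caller's per-standby values keep their order. */
    sorted = malloc(n * sizeof(lsn_t));
    if (sorted == NULL)
        return 0;
    memcpy(sorted, reported, n * sizeof(lsn_t));

    /* Sort newest-first, then take the quorum-th entry (1-based). */
    qsort(sorted, n, sizeof(lsn_t), cmp_lsn_desc);
    result = sorted[quorum - 1];

    free(sorted);
    return result;
}
```

With five standbys reporting 500, 300, 900, 700 and 100 and a quorum of
3, quorum_lsn() returns 500, since three standbys (900, 700, 500) have
reached at least that position. Sorting is O(n log n), but with n around
10 that cost is tiny, which fits the perf numbers above.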
--
Michael

In response to

Re: Quorum commit for multiple synchronous replication. at 2016-12-08 00:07:28 from Robert Haas

Responses

Re: Quorum commit for multiple synchronous replication. at 2016-12-08 09:32:47 from Masahiko Sawada
