Re: Quorum commit for multiple synchronous replication.

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Quorum commit for multiple synchronous replication.
Date: 2016-12-08 07:39:45
Message-ID: CAB7nPqTwDHDDNv6rkCPUg4p03=e4kzU0BN_5KLnnAR8FkmndMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 8, 2016 at 9:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> You could do that, but first I would code up the simplest, cleanest
> algorithm you can think of and see if it even shows up in a 'perf'
> profile. Microbenchmarking is probably overkill here unless a problem
> is visible on macrobenchmarks.

This is what I would go for! The current code is doing a simple thing:
select the Nth element using qsort() after scanning each WAL sender's
values. And I think that Sawada-san got it right. Even running on my
laptop a pgbench run with 10 sync standbys using a data set that fits
into memory, SyncRepGetOldestSyncRecPtr gets at most 0.04% of overhead
using perf top on a non-assert, non-debug build. Hash tables and
allocations get a far larger share. Using the patch,
SyncRepGetSyncRecPtr is at the same level with a quorum set of 10
nodes. Let's kick the ball for now. An extra patch could make things
better later on if that's worth it.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2016-12-08 07:55:44 Re: tuplesort_gettuple_common() and *should_free argument
Previous Message Craig Ringer 2016-12-08 07:11:12 Re: varlena beyond 1GB and matrix