Re: Proposal: "Causal reads" mode for load balancing reads without stale data

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thom Brown <thom(at)linux(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date: 2016-03-26 22:30:20
Message-ID: CAEepm=35nKaYCV95Psgq8LSLq1rQ5cgAvuSFgUCeT=b3w3ZMKA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 26, 2016 at 2:48 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Fri, Mar 25, 2016 at 4:51 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> On Thu, Mar 24, 2016 at 12:34 AM, Thomas Munro
>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>>> On Wed, Mar 23, 2016 at 12:37 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>> +static void WalRcvUnblockSigUsr2(void)
>>>>
>>>> And again here.
>>>
>>> Fixed.
>>>
>>>> + WalRcvUnblockSigUsr2();
>>>> len = walrcv_receive(NAPTIME_PER_CYCLE, &buf);
>>>> + WalRcvBlockSigUsr2();
>>>>
>>>> This does not seem like it will be cheap on all operating systems. I
>>>> think you should try to rejigger this somehow so that it can just set
>>>> the process latch and the wal receiver figures it out from looking at
>>>> shared memory. Like maybe a flag in WalRcvData? An advantage of this
>>>> is that it should cut down on the number of signals significantly,
>>>> because it won't need to send SIGUSR1 when the latch is already set.
>>>
>>> Still experimenting with a latch here. I will come back on this point soon.
>>
>> Here is a latch-based version.
>
> Thanks for the updated version. This looks pretty nice.
>
> I find the routine name libpqrcv_wait to be a bit confusing. This is
> not a routine aimed at being exposed externally like walrcv_send or
> walrcv_receive. I would recommend changing the name to something like
> waitForWALStream or similar.

Done (as "wait_for_wal_stream", following the case convention of nearby stuff).

> Should we be worried about potential backward incompatibilities with
> the new return values of walrcv_receive?

There are three changes to the walrcv_receive interface:

1. It now takes a latch pointer, which may be NULL.

2. If the latch pointer is non-NULL, the function may now return a new
sentinel value WALRCV_RECEIVE_LATCH_SET when the latch is set. (The
pre-existing sentinel value -1 is still in use and has the same value
and meaning as before, but it now has a name:
WALRCV_RECEIVE_COPY_ENDED.)

3. It will no longer return when the process is signalled (a latch
should be used to ask it to return instead).

Effectively, any calling code would need to change at least to pass
NULL, or a latch if it needs to be able to ask walrcv_receive to
return; and any alternative implementation of the WAL receiver
interface would need to use WaitEventSet (or WaitLatchOrSocket) as its
event loop instead of whatever it might be using now, so that it can
detect a latch's state. But in any case, any such code would fail to
compile against 9.6 due to the new argument, and you can only get the
new return value if you ask for it by passing in a latch. What
affected code are we aware of -- either users of libpqwalreceiver.so
or other WAL receiver implementations?
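
To illustrate, a caller under the new interface might look something
like this. This is only a sketch based on the description above: the
exact signature, the use of WalRcv->latch, and the call to
XLogWalRcvProcessMsg are assumptions, not necessarily what the patch
does.

for (;;)
{
    char   *buf;
    int     len;

    /* Pass a latch so that walrcv_receive returns when it is set. */
    len = walrcv_receive(NAPTIME_PER_CYCLE, &buf, &WalRcv->latch);

    if (len == WALRCV_RECEIVE_COPY_ENDED)       /* the old -1, now named */
        break;
    else if (len == WALRCV_RECEIVE_LATCH_SET)   /* new sentinel value */
    {
        ResetLatch(&WalRcv->latch);
        /* check shared memory for whatever we were asked to do */
    }
    else if (len > 0)
        XLogWalRcvProcessMsg(buf[0], &buf[1], len - 1);
}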

> Do you have numbers to share regarding how the latch-based approach
> performs compared to the approach that used SIGUSR2 when
> remote_apply is used?

I couldn't measure any significant change (Linux, all on localhost, 128 cores):

pgbench -c 1 -N bench2 -T 600

0001-remote-apply-v5.patch (signals), remote_apply -> 449 TPS
0001-remote-apply-v6.patch (latches), remote_apply -> 452 TPS

pgbench -c 64 -j 32 -N bench2 -T 600

0001-remote-apply-v5.patch (signals), remote_apply -> 8536 TPS
0001-remote-apply-v6.patch (latches), remote_apply -> 8534 TPS

Incidentally, I also did some testing on what happens when you signal
a process that is busily writing and fsyncing. I tested a few
different kernels, write sizes and disk latencies, and all of them
were fine up to 10k signals/sec, but beyond that some systems' fsync
performance began to degrade. Only Linux on Power was still happily
fsyncing at around 3/4 of full speed when signalled by a 2MHz+ tight
kill loop (!), while FreeBSD and Linux on Intel weren't able to make
much progress at all under such adversity. So I suppose that if you
could get the commit rate up into five figures you might be able to
measure an improvement for the latch version due to latch-collapsing,
though I noticed a large amount of signal-collapsing going on at the
OS level on all systems I tested anyway, so maybe it wouldn't make a
difference. I attach that test program for interest.
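
For anyone curious, the shape of that test is roughly the following.
This is a simplified sketch of the idea, not the attached
interrupting_cow.c itself; error handling is omitted.

#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static void
handler(int signo)
{
    /* do nothing; exists only so signals can interrupt syscalls */
}

int
main(void)
{
    pid_t   child = fork();

    if (child == 0)
    {
        /*
         * Child: install the handler without SA_RESTART so that signals
         * actually interrupt write() and fsync(), then write and fsync
         * in a loop, retrying on interruption.
         */
        struct sigaction sa;
        char    buffer[8192] = {0};
        int     fd = open("testfile", O_WRONLY | O_CREAT, 0644);
        int     i;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = handler;
        sigaction(SIGUSR2, &sa, NULL);

        for (i = 0; i < 10000; ++i)
        {
            while (write(fd, buffer, sizeof(buffer)) < 0)
                ;               /* retry after EINTR */
            while (fsync(fd) < 0)
                ;               /* retry after EINTR */
        }
        close(fd);
        exit(0);
    }

    /* Parent: signal the child in a tight loop until it finishes. */
    while (waitpid(child, NULL, WNOHANG) == 0)
        kill(child, SIGUSR2);

    return 0;
}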

Also, I updated the comment for the declaration of the latch in
walreceiver.h to say something about the new usage.

New patch series attached.

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
0001-remote-apply-v7.patch application/octet-stream 28.0 KB
0002-replay-lag-v7.patch application/octet-stream 24.5 KB
0003-refactor-syncrep-exit-v7.patch application/octet-stream 4.6 KB
0004-causal-reads-v7.patch application/octet-stream 74.3 KB
interrupting_cow.c text/x-csrc 2.6 KB
