Re: Proposal: "Causal reads" mode for load balancing reads without stale data

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thom Brown <thom(at)linux(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date: 2016-03-28 12:56:28
Message-ID: CAEepm=0hg_FX7kdUhosTpJ_kPsUZw6k-7nuQNy-dGAOaetn_tA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 28, 2016 at 8:54 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> static void WalRcvQuickDieHandler(SIGNAL_ARGS);
>
> -
> static void
> ProcessWalRcvInterrupts(void)
> Noise here.

Fixed.

> + ret = WaitLatchOrSocket(latch, events, PQsocket(streamConn), timeout_ms);
> +
> + if (ret & WL_POSTMASTER_DEATH)
> + exit(0);
> Exiting within libpqwalreceiver.so is no good. I think that even in
> the case of a postmaster cleanup we should allow things to be cleaned
> up.

I agree. I suppose this is really a symptom of the problem you talked
about below: see response there.

> /*
> + * Wake up the walreceiver if it happens to be blocked in walrcv_receive,
> + * and tell it that a commit record has been applied.
> + *
> + * This is called by the startup process whenever interesting xlog records
> + * are applied, so that walreceiver can check if it needs to send an apply
> + * notification back to the master which may be waiting in a COMMIT with
> + * synchronous_commit = remote_apply.
> + */
> +void
> +WalRcvWakeup(void)
> +{
> + SetLatch(&WalRcv->latch);
> +}
> I think here that it would be good to add an assertion telling that
> this can just be called by the startup process while in recovery,
> WalRcv->latch is not protected by a mutex lock.
>
> +maximum of 'timeout' ms. If a message was successfully read, returns
> +its length. Otherwise returns 0 for timeout, WALRCV_RECEIVE_COPY_ENDED
> +for disconnection or WALRCV_RECEIVE_LATCH_SET. On success, a pointer
> Having an assigned constant name for timeout would be good for
> consistency with the rest.

Yeah, I guess it would have to mirror all the WL_XXX flags if we
continue down that path, but...

> I have been also thinking a lot about this patch, and the fact that
> the WAL receiver latch is being used within the internals of
> libpqwalreceiver has been bugging me a lot, because this makes the
> wait phase happening within the libpqwalreceiver depend on something
> that only the WAL receiver had a only control on up to now (among the
> things thought: having a second latch for libpqwalreceiver, having an
> event interface for libpqwalreceiver, switch libpq_receive into being
> asynchronous...).

Yeah, it bugs me too. Do you prefer this?

int walrcv_receive(char **buffer, int *wait_fd);

Return value -1 means end-of-copy as before, return value 0 means "no
data available now, please call me again when *wait_fd is ready to
read". Then walreceiver.c can look after the WaitLatchOrSocket call
and deal with socket readiness, postmaster death, timeout and latch,
and libpqwalreceiver.c doesn't know anything about all that stuff
anymore, but it is now part of the interface that it must expose a
file descriptor for readiness testing when it doesn't have data
available.

Please find attached a new patch series which does it that way.

> At the end, we need a way to allow the startup
> process to let the WAL receiver process know that it needs to be
> interrupted via shared memory, and that's the WAL receiver latch, the
> removal of epoll stuff cleans up some code at the end. So it seems
> that I finally made my mind on 0001 and it looks good to me except the
> small things mentioned above.

Thanks!

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
0001-remote-apply-v8.patch application/octet-stream 24.3 KB
0002-replay-lag-v8.patch application/octet-stream 24.5 KB
0003-refactor-syncrep-exit-v8.patch application/octet-stream 4.6 KB
0004-causal-reads-v8.patch application/octet-stream 74.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2016-03-28 13:08:26 Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Previous Message Alexander Korotkov 2016-03-28 12:46:43 Re: Move PinBuffer and UnpinBuffer to atomics