Re: Proposal: "Causal reads" mode for load balancing reads without stale data

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date: 2015-11-12 12:16:16
Message-ID: CANP8+jKUhV9V7tqgQkBB+NkcJdu2-yD10FfXaTXUVxZd=e+PNw@mail.gmail.com
Lists: pgsql-hackers

On 11 November 2015 at 09:22, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:

> 1. Reader waits with exposed LSNs, as Heikki suggests. This is what
> BerkeleyDB does in "read-your-writes" mode. It means that application
> developers have the responsibility for correctly identifying transactions
> with causal dependencies and dealing with LSNs (or whatever equivalent
> tokens), potentially even passing them to other processes where the
> transactions are causally dependent but run by multiple communicating
> clients (for example, communicating microservices). This makes it
> difficult to retrofit load balancing to pre-existing applications and (like
> anything involving concurrency) difficult to reason about as applications
> grow in size and complexity. It is efficient if done correctly, but it is
> a tax on application complexity.
>

Agreed. This works if you have a single transaction connected through a pool
that does statement-level load balancing, so it works in both session and
transaction mode.

I was in favour of a scheme like this myself, earlier, but have more
thoughts now.

We must also consider the need for serialization across sessions or
transactions.

In transaction pooling mode, an application could get assigned a different
session, so a token would be much harder to pass around.
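
To make option (1) concrete, here is a minimal sketch of the token-passing
scheme, not an implementation from this thread. FakeStandby, wait_for_token
and the polling loop are hypothetical names invented for illustration; the
only PostgreSQL-specific detail is the textual LSN format (two hex numbers
separated by a slash). The point it shows is that the token is an explicit
argument the application has to carry from the writing transaction to the
reading one, whichever session the pool hands it.

```python
import time


def parse_lsn(text):
    """Convert an LSN string such as '0/16B3748' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


class FakeStandby:
    """Hypothetical stand-in for a standby that tracks its replay position."""

    def __init__(self, replayed):
        self.replayed = parse_lsn(replayed)

    def apply(self, lsn_text):
        # Replay never moves backwards.
        self.replayed = max(self.replayed, parse_lsn(lsn_text))


def wait_for_token(standby, token, timeout=0.5, poll=0.01):
    """Return True once the standby has replayed up to token, False on timeout."""
    target = parse_lsn(token)
    deadline = time.monotonic() + timeout
    while standby.replayed < target:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True


# The writer's commit produced this token; the application must hand it to
# whichever client (or session) runs the causally dependent read.
token = "0/2000"
standby = FakeStandby("0/1000")
lagging = wait_for_token(standby, token, timeout=0.05)  # standby is behind
standby.apply("0/2000")
caught_up = wait_for_token(standby, token)
```

Because the token lives in application code rather than in the session, this
works the same in session and transaction pooling, but only as long as every
causally dependent client actually receives it.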

> 2. Reader waits for a conservatively chosen LSN. This is roughly what
> MySQL derivatives do in their "causal_reads = on" and "wsrep_sync_wait =
> 1" modes. Read transactions would start off by finding the current end
> of WAL on the primary, since that must be later than any commit that
> already completed, and then waiting for that to apply locally. That means
> every read transaction waits for a complete replication lag period,
> potentially unnecessarily. This is a tax on readers with unnecessary waiting.
>

This tries to make it easier for users by forcing all users to experience a
causality delay. Given that the whole purpose of multi-node load balancing is
performance, referencing the master again simply defeats any performance
gain, so you couldn't ever use it for all sessions. It could be a USERSET
parameter, so it could be turned off in most cases that didn't need it. But
it's easier to use than (1).

Though this should be implemented in the pooler.
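
As a rough sketch of what option (2) would look like when done in the pooler,
under stated assumptions: FakePrimary, FakeStandby, Pooler and the
causal_reads flag are hypothetical names for illustration, and WAL positions
are plain integers. It is not how any existing pooler behaves. It shows the
two properties discussed above: every causal read costs a round trip to the
primary, and a per-session switch lets sessions that don't need it skip the
wait.

```python
import time


class FakePrimary:
    """Hypothetical primary; wal_end grows with every commit."""

    def __init__(self):
        self.wal_end = 0

    def commit(self):
        self.wal_end += 1
        return self.wal_end


class FakeStandby:
    """Hypothetical standby; replay() catches up with the primary."""

    def __init__(self):
        self.replayed = 0

    def replay(self, primary):
        self.replayed = primary.wal_end


class Pooler:
    def __init__(self, primary, standby, causal_reads=True):
        self.primary = primary
        self.standby = standby
        self.causal_reads = causal_reads  # per-session switch (assumed)
        self.primary_round_trips = 0

    def begin_read(self, timeout=0.5, poll=0.01):
        """Return True if the read may start on the standby, False on timeout."""
        if not self.causal_reads:
            return True  # no causality guarantee, no extra cost
        # Conservative target: the primary's current end of WAL.
        self.primary_round_trips += 1
        target = self.primary.wal_end
        deadline = time.monotonic() + timeout
        while self.standby.replayed < target:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
        return True


primary, standby = FakePrimary(), FakeStandby()
primary.commit()
causal = Pooler(primary, standby, causal_reads=True)
before = causal.begin_read(timeout=0.05)  # standby has not replayed yet
standby.replay(primary)
after = causal.begin_read()
```

Note that the causal session contacts the primary on every read, even when
nothing it depends on has changed.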

> 3. Writer waits, as proposed. In this model, there is no tax on readers
> (they have zero overhead, aside from the added complexity of dealing with
> the possibility of transactions being rejected when a standby falls behind
> and is dropped from 'available' status; but database clients must already
> deal with certain types of rare rejected queries/failures such as
> deadlocks, serialization failures, server restarts etc). This is a tax on
> writers.
>

This would seem to require that all readers must first check with the
master as to which standbys are now considered available, so it looks like
(2).

The alternative is that we simply send readers to any standby and allow the
pool to work out separately whether the standby is still available, which
mostly works, but it doesn't handle sporadic slowdowns on particular
standbys very well (if at all).
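
A toy model of option (3) may help frame that question. Everything in it is
an assumption made for illustration: the Standby and Primary classes, the
boolean `available` flag, and the immediate drop of an unresponsive standby
(the proposal has the writer wait for a timeout first). In particular, the
sketch simply lets the reader see the flag, and how a reader or pool learns
that state without asking the master is exactly the open point.

```python
class StandbyUnavailable(Exception):
    """Raised when a read is sent to a standby that was dropped."""


class Standby:
    def __init__(self, name, responsive=True):
        self.name = name
        self.responsive = responsive
        self.applied = 0
        self.available = True

    def apply(self, lsn):
        """Apply WAL up to lsn; return False if the standby cannot keep up."""
        if not self.responsive:
            return False
        self.applied = lsn
        return True


class Primary:
    def __init__(self, standbys):
        self.standbys = standbys
        self.lsn = 0

    def commit(self):
        """The writer pays: commit returns only after every available
        standby has applied the change or has been dropped."""
        self.lsn += 1
        for standby in self.standbys:
            if standby.available and not standby.apply(self.lsn):
                standby.available = False
        return self.lsn


def read(standby):
    """Readers do no waiting, but may be rejected outright."""
    if not standby.available:
        raise StandbyUnavailable(standby.name)
    return standby.applied


fast = Standby("fast")
slow = Standby("slow", responsive=False)
primary = Primary([fast, slow])
committed = primary.commit()
seen_on_fast = read(fast)
```

A read sent to `slow` after this commit raises StandbyUnavailable, which is
the "rejected transaction" case the client has to handle.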

I think we need to look at whether this does actually give us anything, or
whether we are missing the underlying Heisenberg reality.

More later.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
