Re: Proposal: "Causal reads" mode for load balancing reads without stale data

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date: 2015-11-16 10:44:29
Message-ID: CANP8+j+BgzJ0b3-M8RmYBxHzwCK3C0UR4Zy27uBQNFo7KKa_qA@mail.gmail.com
Lists: pgsql-hackers

On 15 November 2015 at 14:50, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Sun, Nov 15, 2015 at 5:41 AM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> wrote:
> > Hmm, if that's where we're at, I'll summarize my thoughts.
> >
> > All of this discussion presupposes we are distributing/load balancing
> > queries so that reads and writes might occur on different nodes.
>
> Agreed. I think that's a pretty common pattern, though certainly not
> the only one.
>

It looks to me as though this functionality is only of use in a pooler.
Please explain how else it would be used.

> > Your option (2) is wider but also worse in some ways. It can be
> > implemented in a pooler.
> >
> > Your option (3) doesn't excite me much. You've got a load of stuff that
> > really should happen in a pooler. And at its core we have
> > synchronous_commit = apply but with a timeout rather than a wait.
>
> I don't see how either option (2) or option (3) could be implemented
> in a pooler. How would that work?
>

My starting thought was that (1) was the only way forward. Through
discussion, I now see that it's not the best solution for the general case.

The pooler knows which statements are reads and which are writes, and it
also knows about transaction boundaries, so it is possible for it to perform
the waits for either (2) or (3). The pooler *needs* to know which nodes it
can route queries to, so it looks to me that the pooler is the best place to
put waits and track the status of nodes, no matter when we wait. I don't see
any benefit in having other nodes keep track of node status, since that will
just duplicate work that *must* be performed in the pooler.
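
To make that concrete, here is a rough sketch, in Python, of the kind of
bookkeeping a pooler could do. Everything in it is hypothetical and for
illustration only: the Node and Pooler classes, record_commit, report_replay
and route are invented names, WAL positions are modelled as plain integers,
and falling back to the primary stands in for whatever wait or timeout policy
a real pooler would apply.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    replay_lsn: int = 0  # last WAL position this node reports as applied (assumed integer)


@dataclass
class Pooler:
    """Hypothetical pooler that tracks node status and routes statements."""
    primary: Node
    standbys: List[Node] = field(default_factory=list)
    last_commit_lsn: int = 0  # WAL position of the latest write seen by the pooler

    def record_commit(self, commit_lsn: int) -> None:
        # The pooler sees transaction boundaries, so it can note each commit.
        self.last_commit_lsn = max(self.last_commit_lsn, commit_lsn)

    def report_replay(self, name: str, lsn: int) -> None:
        # Status update from a standby: how far it has applied WAL.
        for node in self.standbys:
            if node.name == name:
                node.replay_lsn = max(node.replay_lsn, lsn)

    def route(self, is_write: bool) -> Node:
        # Writes always go to the primary.
        if is_write:
            return self.primary
        # Reads go to the first standby that has applied the latest write.
        for node in self.standbys:
            if node.replay_lsn >= self.last_commit_lsn:
                return node
        # No standby is caught up: fall back to the primary
        # (a real pooler might wait with a timeout instead).
        return self.primary
```

In this sketch only the pooler holds the node status, which is the point being
argued: the routing decision already requires that information in one place.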

I would like to see a load balancing pooler in Postgres.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
