Re: Synchronous replay take III

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Synchronous replay take III
Date: 2018-11-30 20:06:57
Message-ID: CA+q6zcV6463xcLXhN9F1p0vLie8JRs=2jZ27XDW5x_62SUesBg@mail.gmail.com
Lists: pgsql-hackers

> On Thu, Nov 15, 2018 at 6:34 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > On Thu, Mar 1, 2018 at 10:40 AM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> >
> > In previous threads[1][2][3] I called this feature proposal "causal
> > reads". That was a terrible name, borrowed from MySQL. While it is
> > probably a useful term of art, for one thing people kept reading it as
> > "casual"

Yeah, it was rather annoying that I couldn't get rid of this while playing
with the "take II" version :)

> To be clear, what did you mean by read-mostly workloads?
>
> I think there are two kinds of reads on standbys: a read that happens
> after writes and a direct read (e.g. reporting). The former usually
> requires causal reads, as you mentioned, in order to read its own
> writes, but the latter might be different: it often wants to read the
> latest data on the master at that time. IIUC, even if we send a
> read-only query directly to a synchronous replay server, we could get a
> stale result if the standby is delayed by less than
> synchronous_replay_max_lag. So this synchronous replay feature would
> be helpful for the former case (i.e. a few writes and many reads that
> want to see them), whereas for the latter case perhaps keeping the reads
> waiting on the standby seems a reasonable solution.
>
> Also, I think it's worth considering the cost of both causal reads *and*
> non-causal reads.
>
> I've considered a mixed workload (transactions requiring causal reads
> and transactions not requiring them) on the current design. IIUC, the
> current design creates something like a consistent-reads group by
> specifying servers. For example, if a transaction doesn't need causal
> reads, it can send its query to any server with
> synchronous_replay = off, but if it does, it should select a
> synchronous replay server. It also means that client applications or
> routing middleware such as pgpool are required to be aware of the
> available synchronous replay standbys. That is, this design would
> impose a cost on the read-only transactions requiring causal reads. On
> the other hand, with token-based causal reads we can send a read-only
> query to any standby if we can wait for the change to be replayed. Of
> course, if we don't want to wait forever, we can time out and switch to
> either another standby or the master to execute the query, but we don't
> need to choose a particular server among the standby servers.
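
To make the token-based alternative above a bit more concrete, here is a toy
sketch of the routing logic as I read it: try any standby, wait (with a
timeout) until it has replayed up to the token, otherwise move on, and fall
back to the master at the end. Everything in it is an assumption made for
illustration: LSNs are plain integers, time is counted in simulated ticks, and
the names Standby, wait_for_replay and route_causal_read are made up. None of
this comes from the patch or from any PostgreSQL API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Standby:
    """Hypothetical model of a standby; LSNs are plain integers here."""
    name: str
    replayed_lsn: int      # how far WAL replay has progressed
    replay_rate: int = 0   # LSN units replayed per simulated tick

    def wait_for_replay(self, token_lsn: int, max_ticks: int) -> bool:
        """Wait up to max_ticks for replay to reach token_lsn."""
        for _ in range(max_ticks):
            if self.replayed_lsn >= token_lsn:
                return True
            self.replayed_lsn += self.replay_rate  # one tick of replay
        return self.replayed_lsn >= token_lsn


def route_causal_read(token_lsn: int, standbys: List[Standby],
                      max_ticks: int) -> str:
    """Return the name of the server that may run the read-only query."""
    for standby in standbys:
        # Any standby will do, as long as it catches up before the timeout.
        if standby.wait_for_replay(token_lsn, max_ticks):
            return standby.name
    # Every standby timed out: the master always sees its own writes.
    return "master"
```

The point of the sketch is that the client only carries the token (the LSN of
its last write) and does not need to know which standbys belong to a
synchronous replay set; the price is the wait, and possibly a retry, on the
read side.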

Unfortunately, cfbot says that the patch can't be applied without conflicts.
Could you please post a rebased version and address the comments from Masahiko?
