Re: Proposal: "Causal reads" mode for load balancing reads without stale data

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thom Brown <thom(at)linux(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Proposal: "Causal reads" mode for load balancing reads without stale data
Date: 2016-03-09 23:35:53
Message-ID: CAEepm=3NF=7eLkVR2fefVF9bg6RxpZXoQFmOP3RWE4r4iuO7vg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 9, 2016 at 6:07 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Agreed. I have split this work up into four patches which apply on
> top of each other, and provide something (hopefully) useful at each
> stage.

Yesterday's patch set doesn't apply after commit
b6fb6471f6afaf649e52f38269fd8c5c60647669 which added a neighbouring
line in pg_proc.h, so here's a new set that does.

I looked into COMMIT PREPARED replay feedback and realised that it
doesn't need any special handling beyond what is already in
xact_redo_commit. However, I see now that I *do* need to do something
when replaying PREPARE TRANSACTION, as you said. Not for causal reads
though -- it doesn't care about an operation with no visible effect --
but for synchronous_commit = remote_apply. I am thinking about how to
fix that. (Have PREPARE TRANSACTION wait only for flush even though
you asked for remote_apply? Add a 'feedback please' bit to
XLOG_XACT_PREPARE records? Always send feedback when replaying
XLOG_XACT_PREPARE records?)
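
To make the remote_apply problem concrete, here is a toy Python model of my understanding of it. It is not PostgreSQL code: the Standby class, the feedback_on_prepare switch and remote_apply_wait_satisfied are hypothetical names invented for illustration, and it assumes the standby reports its apply position only when it replays a record that asks for feedback (periodic status messages and everything else are ignored). With that assumption, a backend waiting on a PREPARE record's LSN never sees it reported, and the "always send feedback when replaying XLOG_XACT_PREPARE" option would release it.

```python
from dataclasses import dataclass

# Record type labels borrowed from the discussion; the model itself is hypothetical.
COMMIT = "XLOG_XACT_COMMIT"
PREPARE = "XLOG_XACT_PREPARE"


@dataclass
class Standby:
    """Toy standby: replays records, reports apply position only on request."""
    feedback_on_prepare: bool  # assumption: models "always send feedback on PREPARE"
    applied_lsn: int = 0       # how far replay has actually got
    reported_lsn: int = 0      # last apply position sent back to the primary

    def replay(self, lsn: int, record_type: str) -> None:
        self.applied_lsn = lsn
        wants_feedback = record_type == COMMIT or (
            record_type == PREPARE and self.feedback_on_prepare
        )
        if wants_feedback:
            self.reported_lsn = self.applied_lsn


def remote_apply_wait_satisfied(standby: Standby, wait_lsn: int) -> bool:
    """The primary's waiter is released once the reported apply LSN reaches it."""
    return standby.reported_lsn >= wait_lsn


if __name__ == "__main__":
    wal = [(100, COMMIT), (200, PREPARE)]  # made-up LSNs
    for flag in (False, True):
        standby = Standby(feedback_on_prepare=flag)
        for lsn, record_type in wal:
            standby.replay(lsn, record_type)
        released = remote_apply_wait_satisfied(standby, wait_lsn=200)
        print(f"feedback_on_prepare={flag}: waiter released = {released}")
```

In this model the PREPARE record is applied in both cases; the only difference is whether the primary ever hears about it.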

The following rough ballpark numbers (generated with the attached test
client) aren't very scientific or in any way indicative of real
conditions (it's a bunch of clusters running on my laptop), but they
demonstrate that two-phase commit apply feedback is being reported to
the primary straight away in causal reads mode (otherwise the 2PC
causal reads number wouldn't be so high).

Sequential UPDATE in simple transaction:

single node: ~2700 TPS
sync rep remote flush: ~2500 TPS
sync rep remote apply: ~2000 TPS
causal reads (4 standbys): ~1600 TPS

Sequential UPDATE in two-phase commit transaction:

single node: ~900 TPS
sync rep remote flush: ~900 TPS
sync rep remote apply: (hangs)
causal reads (4 standbys): ~900 TPS

(The actual numbers are pretty noisy. I've taken medians of 3 runs and
rounded to the nearest 100, and I guess the replication overheads are
not magnified as much in the case of the slower 2PC workload and then
get lost in the noise. With --check you can verify that the 2PC
transaction is not always visible on the standby it connects to until
you enable --causal-reads, so I don't think it's just broken!)
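
For a rough sense of scale, the rounded simple-transaction figures above can be expressed relative to the single-node number. This is only arithmetic on the ballpark values already quoted; relative_throughput is a helper name made up for this sketch, and given the noise the percentages should not be read as precise.

```python
# Rounded TPS figures quoted above for the simple-transaction workload.
simple_tps = {
    "single node": 2700,
    "sync rep remote flush": 2500,
    "sync rep remote apply": 2000,
    "causal reads (4 standbys)": 1600,
}


def relative_throughput(results, baseline="single node"):
    """Return each configuration's TPS as a whole-number percentage of the baseline."""
    base = results[baseline]
    return {name: round(100 * tps / base) for name, tps in results.items()}


if __name__ == "__main__":
    for name, pct in relative_throughput(simple_tps).items():
        print(f"{name}: {pct}% of single-node TPS")
```

The 2PC rows that completed are all ~900 TPS, so the same calculation gives 100% for each of them, which is the "lost in the noise" effect mentioned above.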

--
Thomas Munro
http://www.enterprisedb.com

Attachment                           Content-Type              Size
0001-remote-apply-v2.patch           application/octet-stream  16.5 KB
0002-replay-lag-v2.patch             application/octet-stream  25.5 KB
0003-refactor-syncrep-exit-v2.patch  application/octet-stream  4.6 KB
0004-causal-reads-v2.patch           application/octet-stream  73.7 KB
test-causal-reads.c                  text/x-csrc               5.5 KB
