Re: Measuring replay lag

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Measuring replay lag
Date: 2017-03-13 23:39:58
Message-ID: CAEepm=2RujYcvTRRp6fmUVr7h8St+FrzeH2=8nJ03CZFQz0xMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Please see separate replies to Simon and Craig below.

On Sun, Mar 5, 2017 at 8:38 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 1 March 2017 at 10:47, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> I do see why a new user trying this feature for the first time might
>> expect it to show a lag of 0 just as soon as sent LSN =
>> write/flush/apply LSN or something like that, but after some
>> reflection I suspect that it isn't useful information, and it would be
>> smoke and mirrors rather than real data.
>
> Perhaps I am misunderstanding the way it works.
>
> If the last time WAL was generated the lag was 14 secs, then nothing
> occurs for 2 hours aftwards AND all changes have been successfully
> applied then it should not continue to show 14 secs for the next 2
> hours.
>
> IMHO the lag time should drop to zero in a reasonable time and stay at
> zero for those 2 hours because there is no current lag.
>
> If we want to show historical lag data, I'm supportive of the idea,
> but we must report an accurate current value when the system is busy
> and when the system is quiet.

Ok, I thought about this for a bit and have a new idea that I hope
will be more acceptable. Here are the approaches considered:

1. Show the last measured lag times on a completely idle system until
such time as the standby eventually processes more lag, as I had it in
the v5 patch. You don't like that and I admit that it is not really
satisfying (even though I know that real Postgres systems alway
generate more WAL fairly soon even without user sessions, it's not
great that it depends on an unknown future event to clear the old
data).

2. Recognise when the last reported write/flush/apply LSN from the
standby == end of WAL on the sending server, and show lag times of
00:00:00 in all three columns. I consider this entirely bogus: it's
not an actual measurement that was ever made, and on an active system
it would flip-flop between real measurements and the artificial
00:00:00. I do not like this.

3. Recognise the end of WAL as above, but show SQL NULL in the
columns. Now we don't show an entirely bogus number like 00:00:00 but
we still have the flickering/flip-flopping between nothing and a
measured number during typical use (ie during short periods of
idleness between writes). I do not like this.

4. Somehow attempt to measure the round trip time for a keep-alive
message or similar during idle periods. This means that we would be
taking something that reflects one component of the usual lag
measurements, namely network transfer, but I think we would making
false claims when we show that in columns that report measured write
time, flush time and apply time. I do not like this.

5. The new proposal: Show only true measured write/flush/apply data,
as in 1, but with a time limit. To avoid the scenario where we show
the same times during prolonged periods of idleness, clear the numbers
like in option 3 after a period of idleness. This way we avoid the
dreaded flickering/flip-flopping. A natural time to do that is when
wal_receiver_status_interval expires on idle systems and defaults to
10 seconds.

Done using approach 5 in the attached version. Do you think this is a
good compromise? No bogus numbers, only true measured
write/flush/apply times, but a time limit on 'stale' lag information.

On Mon, Mar 6, 2017 at 3:22 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 5 March 2017 at 15:31, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> What we want from this patch is something that works for both, as much
>> as that is possible.
>
> If it shows a sawtooth pattern for flush lag, that's good, because
> it's truthful. We can only flush after we replay commit, therefore lag
> is always going to be sawtooth, with tooth size approximating xact
> size and the baseline lag trend representing any sustained increase or
> decrease in lag over time.
>
> This would be extremely valuable to have.

Thanks for your detailed explanation Craig. (I also had a chat with
Craig about this off-list.) Based on your feedback, I've added
support for reporting lag from logical replication, warts and all.

Just a thought: perhaps logical replication could consider
occasionally reporting a 'write' position based on decoded WAL written
to reorder buffers (rather than just reporting the apply LSN as write
LSN at commit time)? I think that would be interesting information in
its own right, but would also provide more opportunities to
interpolate the flush/apply sawtooth for large transactions.

Please find a new version attached.

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
replication-lag-v6.patch application/octet-stream 20.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-03-13 23:59:42 Re: Radix tree for character conversion
Previous Message Andrew Dunstan 2017-03-13 23:26:20 Re: Need a builtin way to run all tests faster manner