Re: Measuring replay lag

From: David Steele <david(at)pgmasters(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Measuring replay lag
Date: 2017-03-21 17:32:58
Message-ID: fecaabd1-0c41-ad8c-37f1-983874817b03@pgmasters.net
Lists: pgsql-hackers

Hi Thomas,

On 3/15/17 8:38 PM, Simon Riggs wrote:
> On 16 March 2017 at 08:02, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>
>> I agree that these states exist, but we disagree on what 'lag' really
>> means, or, rather, which of several plausible definitions would be the
>> most useful here.
>>
>> My proposal is that the *_lag columns should always report how long it
>> took for recently written, flushed and applied WAL to be written,
>> flushed and applied (and for the primary to know about it). By this
>> definition, sent LSN = applied LSN is not a special case: we simply
>> report how long that LSN took to be written, flushed and applied.
>>
>> Your proposal is that the *_lag columns should report how far in the
>> past the standby is at each of the three stages with respect to the
>> current end of WAL. By this definition when sent LSN = applied LSN we
>> are currently in the 'A' state meaning 'caught up' and should show
>> 00:00:00.
>
> I accept your proposal for how we handle these, on condition that you
> write up some docs that explain the subtle difference between the two,
> so we can just show people the URL. That needs to explain clearly the
> difference in an impartial way between "what is the most recent lag
> measurement" and "how long until we are caught up" as possible
> interpretations of these values. Thanks.
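
To make the two definitions quoted above concrete, here is a minimal sketch and not the patch's implementation. It assumes the primary keeps a hypothetical map from LSN to the time that WAL was written locally, with LSNs as plain integers and times as seconds. The function names, and the choice to measure "how far behind" by the age of the oldest WAL not yet applied, are illustrative assumptions.

```python
def measured_lag(write_times, applied_lsn, reply_time):
    """First definition: how long the most recently applied LSN took.

    write_times: hypothetical dict {lsn: time the primary wrote that WAL}
    reply_time:  time the primary learned that applied_lsn was applied
    """
    return reply_time - write_times[applied_lsn]


def behind_lag(write_times, applied_lsn, end_of_wal_lsn, now):
    """Second definition: how far in the past the standby currently is.

    Returns 0 when the standby is caught up (applied LSN has reached the
    current end of WAL); otherwise the age of the oldest WAL not yet
    applied, which is an assumption made only for this sketch.
    """
    if applied_lsn >= end_of_wal_lsn:
        return 0.0
    pending = [t for lsn, t in write_times.items() if lsn > applied_lsn]
    return now - min(pending)


# Hypothetical samples: LSN 100 written at t=0.0, LSN 200 at t=1.0.
samples = {100: 0.0, 200: 1.0}

# The standby reports LSN 200 applied; the primary hears about it at t=1.5.
print(measured_lag(samples, 200, 1.5))       # last measurement stays non-zero
print(behind_lag(samples, 200, 200, 10.0))   # caught up, so zero
```

With no new WAL arriving, the first function keeps returning the last measurement, while the second drops to zero once sent LSN = applied LSN. That is the difference the requested documentation needs to explain.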

This thread has been idle for six days. Please respond and/or post a
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".

Thanks,
--
-David
david(at)pgmasters(dot)net
