Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Subject: Re: pg_stat_replication lag fields return non-NULL values even with NULL LSNs
Date: 2019-08-13 02:19:53
Message-ID: 20190813021953.GB2551@paquier.xyz
Lists: pgsql-hackers

On Tue, Aug 13, 2019 at 11:15:42AM +1200, Thomas Munro wrote:
> Hmm. It's working as designed, but indeed it's not very newsworthy
> information in this case. If you run pg_receivewal --synchronous then
> you get sensible looking flush_lag times. Without that, flush_lag
> only goes up, and of course replay_lag only goes up, so although it's
> telling the truth, I think your proposal makes sense.

Thanks!

> One question I had is what would happen with your patch without
> --synchronous, once it flushes a whole file and opens a new one; I
> wondered if your new boring-information-hiding behaviour would stop
> working after one segment file because of that.

Indeed.

> I tested that and the column remains NULL when we move to a new
> file, so that's good.

Thanks for looking.

> One thing I noticed in passing is that you always get the same times
> in the write_lag and flush_lag columns, in --synchronous mode, and the
> times update infrequently. That's not the case with regular
> replicas; I suspect there is a difference in the time and frequency of
> replies sent to the server, which I guess might make synchronous
> commit a bit "lumpier", but I didn't dig further today.

The messages are sent by pg_receivewal via sendFeedback() in
receivelog.c. With --synchronous, it gets triggered once a flush is
done (but you are not surprised by my reply here, right?), and most
likely the matching values you are seeing come from the messages sent
at the beginning of HandleCopyStream(), where the flush and write
LSNs are equal. This code behaves as I would expect based on your
description and a read of the code I have just done to refresh my
memory, but we may of course have some issues or potential
improvements.
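
To illustrate why equal write and flush LSNs in one reply would give
identical write_lag and flush_lag, here is a toy model in Python. It is
only a sketch and not the server code: the names (lag_for, samples) are
hypothetical, LSNs are plain integers, and it assumes the lag for a
position is the time elapsed since the server sent the newest WAL at or
below the reported LSN. Returning None for a missing LSN is an
assumption standing in for the proposed NULL behaviour.

```python
from typing import List, Optional, Tuple

# Hypothetical names throughout; LSNs are modelled as plain integers.
Sample = Tuple[int, float]  # (lsn, time at which WAL up to lsn was sent)


def lag_for(samples: List[Sample],
            reported_lsn: Optional[int],
            now: float) -> Optional[float]:
    """Time elapsed since the newest sample at or below reported_lsn."""
    if reported_lsn is None:
        # Assumption: no position reported, so no lag value is shown.
        return None
    reached = [sent_at for lsn, sent_at in samples if lsn <= reported_lsn]
    if not reached:
        return None
    return now - max(reached)


samples = [(100, 1.0), (200, 2.0), (300, 3.0)]

# One reply carrying equal write and flush LSNs: the two lags match.
write_lag = lag_for(samples, 300, now=3.5)
flush_lag = lag_for(samples, 300, now=3.5)

# A reply where the flush position trails the write position.
trailing_flush_lag = lag_for(samples, 100, now=3.5)

# A reply with no flush position at all.
no_flush_lag = lag_for(samples, None, now=3.5)
```

In this model the two columns can only differ when a reply reports
different write and flush positions, which would be consistent with the
matching values described above.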
--
Michael
