Non-null values of recovery functions after promote or crash of primary

From: Martín Marqués <martin(at)2ndquadrant(dot)com>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Non-null values of recovery functions after promote or crash of primary
Date: 2019-10-05 11:43:03
Message-ID: 576b29e5-5edf-b4b7-bb73-7ffad63535e2@2ndquadrant.com
Lists: pgsql-bugs pgsql-hackers

Hi,

Yesterday we (that's me and my colleague Ricardo Gomez) were working on
an issue where a monitoring script was returning increasing lag
information on a primary instead of a NULL value.

The query involved the following functions (the function has since been
amended to work around the issue I'm reporting here):

pg_last_wal_receive_lsn()
pg_last_wal_replay_lsn()
pg_last_xact_replay_timestamp()

Under normal circumstances we would expect to receive NULLs from all
three functions on a primary node, and the code comments back this up.

The problem is: what if the node is a standby that was promoted without
restarting, or a node that had to perform crash recovery?

While the node is recovering, the values in `XLogCtl` are updated with
recovery information. Once recovery finishes, either because crash
recovery reached a consistent state or because a standby was promoted,
those values are not reset to their startup defaults.

That's when you start seeing non-null values returned by
`pg_last_wal_replay_lsn()` and `pg_last_xact_replay_timestamp()`.
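
To make the effect on monitoring concrete, here is a minimal sketch of
how a lag check could be guarded so that it reports nothing on a node
that is no longer in recovery. It is not our actual script: the names
`LAG_QUERY` and `replay_lag` are made up for illustration, and the only
PostgreSQL functions assumed are `pg_is_in_recovery()`, `now()` and
`pg_last_xact_replay_timestamp()`.

```python
from datetime import datetime, timedelta
from typing import Optional

# Query a monitoring script could send (illustrative, not our real one).
# pg_is_in_recovery() is false on a primary, including a freshly promoted
# standby, so the CASE yields NULL there even if a stale replay timestamp
# is still being returned.
LAG_QUERY = """
SELECT CASE WHEN pg_is_in_recovery()
            THEN now() - pg_last_xact_replay_timestamp()
       END AS replay_lag
"""


def replay_lag(in_recovery: bool,
               last_replay_ts: Optional[datetime],
               now: datetime) -> Optional[timedelta]:
    """Client-side equivalent of LAG_QUERY (hypothetical helper).

    Returns None unless the node is still in recovery, so a stale
    timestamp left over from recovery is never turned into a lag value.
    """
    if not in_recovery or last_replay_ts is None:
        return None
    return now - last_replay_ts
```

Without the `in_recovery` guard, the stale timestamp keeps getting
subtracted from the current time, which is exactly the ever-increasing
"lag" we saw on the primary.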

Now, I don't know if we should call this a bug or an undocumented
anomaly. We could either change the behaviour by resetting the values in
`XLogCtl` after recovery finishes, or document that non-NULL values may
be returned in these cases.
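
The difference between the two options can be shown with a toy model.
This is only a sketch of the idea and not PostgreSQL code: the
`RecoveryState` class, its fields and the `reset` flag are all
hypothetical stand-ins for the recovery values kept in `XLogCtl`, and
the LSN and timestamp are made-up sample values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecoveryState:
    """Toy stand-in for the recovery values kept in shared memory."""
    in_recovery: bool = False
    replay_lsn: Optional[str] = None
    replay_timestamp: Optional[str] = None

    def replay(self, lsn: str, timestamp: str) -> None:
        # During recovery the values are updated as WAL is replayed.
        self.replay_lsn = lsn
        self.replay_timestamp = timestamp

    def finish_recovery(self, reset: bool) -> None:
        # End of crash recovery or promotion of a standby.
        self.in_recovery = False
        if reset:
            # Hypothetical fix: go back to the startup defaults.
            self.replay_lsn = None
            self.replay_timestamp = None


# Behaviour described in this report: values survive the end of recovery.
current = RecoveryState(in_recovery=True)
current.replay("0/3000060", "2019-10-04 10:00:00")
current.finish_recovery(reset=False)

# Behaviour with a reset: the node looks like any other primary.
proposed = RecoveryState(in_recovery=True)
proposed.replay("0/3000060", "2019-10-04 10:00:00")
proposed.finish_recovery(reset=True)
```

In the first case the replay values stay populated on what is now a
primary; in the second they read as NULL again.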

Regards,

--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
