Hi,
we have a set of clusters that run on pg14 (yes, I know, we are in the
process of upgrading, but it's complicated).
Recently we noticed that some of the DR nodes are lagging because they
get stuck on some WAL files, and the startup process shows
"IPC:RecoveryConflictSnapshot" as its wait event in pg_stat_activity.
The thing is that there are no other connections doing anything to the
db. There are some idle monitoring ones, running things like count(*) from
pg_stat_activity every now and then, but we're talking about pg being
stuck on a single WAL file for up to an hour or so.
Stuck as in:
ps shows:
postgres: 14/main: startup recovering 00000003000065EA000000C6 waiting
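In case it helps to map that segment name to a position in the WAL stream, here is a minimal sketch of decoding it. It assumes the default 16 MB wal_segment_size (an assumption, a non-default size changes the offset), and the function name decode_wal_segment is made up for illustration, not something shipped with PostgreSQL.

```python
def decode_wal_segment(name, segment_size=16 * 1024 * 1024):
    """Split a 24-hex-digit WAL segment file name into its parts.

    Returns (timeline, start_lsn), where start_lsn is the LSN of the
    first byte in the segment, formatted the way PostgreSQL prints LSNs.
    segment_size is assumed to be the default 16 MB.
    """
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")

    timeline = int(name[0:8], 16)    # first 8 hex digits: timeline ID
    log_id = int(name[8:16], 16)     # next 8: high 32 bits of the LSN
    seg_id = int(name[16:24], 16)    # last 8: segment number within log_id

    # Byte offset of the segment start within the 4 GB "log" unit.
    offset = seg_id * segment_size
    start_lsn = f"{log_id:X}/{offset:X}"
    return timeline, start_lsn


if __name__ == "__main__":
    print(decode_wal_segment("00000003000065EA000000C6"))
```

Under that assumption the file above is on timeline 3 and starts at LSN 65EA/C6000000, which can be compared with the replay position the standby reports.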
What could it be, and how can we fix it?
Best regards,
depesz