Re: WAL replay issue from 9.6.8 to 9.6.10

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Dave Peticolas <dave(at)krondo(dot)com>
Cc: Alexander Kukushkin <cyberdemn(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: WAL replay issue from 9.6.8 to 9.6.10
Date: 2018-08-29 20:50:03
Message-ID: 20180829205003.GD5903@paquier.xyz
Lists: pgsql-general

On Wed, Aug 29, 2018 at 09:15:29AM -0700, Dave Peticolas wrote:
> Oh, perhaps I do, depending on what you mean by worker. There are a couple
> of periodic processes that connect to the server to obtain metrics. Is that
> what is triggering this issue? In my case I could probably suspend them
> until the replay has reached the desired point.

That would be it. How do you decide when those begin to run and connect
to Postgres? Do you use pg_isready or similar in a loop for sanity
checks?
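
For illustration, a readiness loop of the kind asked about could look
roughly like the sketch below. It is only an assumption about how such a
gate might be written, not a description of the setup in this thread:
the helper names (check_once, wait_until_ready), the host, port, attempt
count and delay are all hypothetical. It relies on the documented
pg_isready exit statuses (0 accepting, 1 rejecting, 2 no response, 3 no
attempt made). Note that status 0 only says the server accepts
connections; it does not say replay has reached any particular point.

```python
import subprocess
import time

# Exit statuses documented for pg_isready.
PG_ISREADY_STATUS = {
    0: "accepting connections",
    1: "rejecting connections (for example during startup)",
    2: "no response",
    3: "no attempt made (for example invalid parameters)",
}


def check_once(host="localhost", port=5432, run=subprocess.run):
    """Run pg_isready once and return its exit status.

    host and port are placeholder defaults; `run` is injectable so the
    function can be exercised without a real server.
    """
    result = run(["pg_isready", "-q", "-h", host, "-p", str(port)])
    return result.returncode


def wait_until_ready(check=check_once, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll `check` until it returns 0.

    Returns True as soon as the server reports it accepts connections,
    or False once `attempts` checks have all failed.
    """
    for attempt in range(attempts):
        if check() == 0:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next probe
    return False
```

A metrics collector could call wait_until_ready() before opening its
first connection and skip the run when it returns False. Holding the
collector back until replay has passed a given point would need an
extra check on top of this, which the sketch does not attempt.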

> I have noticed this behavior in the past but prior to 9.6.10 restarting the
> server would fix the issue. And the replay always seemed to reach a point
> past which the problem would not re-occur.

You are piquing my interest here. Did you actually see the same
problem? In 9.6.10 I have tightened the consistent-point checks and
logic so that inconsistent page issues actually show up when they
should, and so that they become reproducible, which lets us track down
any rogue WAL record or inconsistent behavior.
--
Michael
