Re: Broken hint bits (freeze)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Borodin Vladimir <root(at)simply(dot)name>
Subject: Re: Broken hint bits (freeze)
Date: 2017-05-24 12:44:32
Message-ID: CA+TgmoY9DGeoAvRGjVFurYUUnGqXtEKi6-nScr0BkSeLKxovBg@mail.gmail.com
Lists: pgsql-hackers

On Wed, May 24, 2017 at 7:27 AM, Dmitriy Sarafannikov
<dsarafannikov(at)yandex(dot)ru> wrote:
> It seems the replica did not replay the corresponding WAL records.
> Any thoughts?

heap_xlog_freeze_page() is a pretty simple function. It's not
impossible that it could have a bug that causes it to incorrectly skip
records, but it's not clear why that wouldn't affect many other replay
routines equally, since the pattern of using the return value of
XLogReadBufferForRedo() to decide what to do is widespread.
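
For readers unfamiliar with that pattern, here is a minimal, self-contained
sketch of it. This is a toy model, not PostgreSQL source: the enum value names
match PostgreSQL's XLogRedoAction, but ToyPage, ToyFreezeRecord,
toy_read_buffer_for_redo() and toy_freeze_redo() are hypothetical stand-ins
invented for illustration, and the "freeze" is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an LSN; NOT the real PostgreSQL definition. */
typedef uint64_t XLogRecPtr;

/* Value names mirror PostgreSQL's XLogRedoAction. */
typedef enum
{
    BLK_NEEDS_REDO,     /* page is older than the record: apply the change */
    BLK_DONE,           /* page already contains the change: skip it */
    BLK_RESTORED,       /* page was overwritten from a full-page image */
    BLK_NOTFOUND        /* block no longer exists: nothing to do */
} XLogRedoAction;

/* Hypothetical page: the LSN of the last applied record plus a toy payload. */
typedef struct
{
    XLogRecPtr  lsn;
    int         frozen_tuples;
} ToyPage;

/* Hypothetical freeze record. */
typedef struct
{
    XLogRecPtr  end_lsn;            /* LSN just past this record */
    int         has_full_page_image;
    int         fpi_frozen_tuples;  /* page state stored in the image */
    int         ntuples;            /* tuples this record freezes */
} ToyFreezeRecord;

/* Decide what replay has to do for this page (assumed simplification). */
static XLogRedoAction
toy_read_buffer_for_redo(const ToyFreezeRecord *rec, ToyPage *page)
{
    if (page == NULL)
        return BLK_NOTFOUND;

    if (rec->has_full_page_image)
    {
        /* The image already reflects the change, so just install it. */
        page->frozen_tuples = rec->fpi_frozen_tuples;
        page->lsn = rec->end_lsn;
        return BLK_RESTORED;
    }

    /* A page at or past the record's LSN already has the change. */
    if (rec->end_lsn <= page->lsn)
        return BLK_DONE;

    return BLK_NEEDS_REDO;
}

/* Redo routine: apply the change only when told the page needs it. */
static XLogRedoAction
toy_freeze_redo(const ToyFreezeRecord *rec, ToyPage *page)
{
    XLogRedoAction action;

    assert(rec != NULL);
    action = toy_read_buffer_for_redo(rec, page);

    if (action == BLK_NEEDS_REDO)
    {
        page->frozen_tuples += rec->ntuples;
        page->lsn = rec->end_lsn;   /* stamp the page so a re-replay is skipped */
    }
    return action;
}
```

In this model, replaying the same record twice against the same page applies
the change once and then reports BLK_DONE. A redo routine that wrongly skipped
work would have to get a wrong answer from this shared decision step, which is
why a bug there would be expected to show up in other record types too.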

Can you prove that other WAL records generated around the same time as
the freeze record *were* replayed on the standby? If so, that proves
that this isn't just a case of the WAL never reaching the standby.
Can you look at the segment that contains the relevant freeze record
with pg_xlogdump? Maybe that record is messed up somehow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
