Re: Broken hint bits (freeze)

From: Vladimir Borodin <root(at)simply(dot)name>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Broken hint bits (freeze)
Date: 2017-05-27 17:16:25
Message-ID: 060EBFE2-D33E-4CD2-AE58-499A22040F2F@simply.name
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> 27 мая 2017 г., в 19:56, Andres Freund <andres(at)anarazel(dot)de> написал(а):
>
> On 2017-05-27 19:48:24 +0300, Vladimir Borodin wrote:
>> Well, actually clean shutdown of master with exit code 0 from `pg_ctl
>> stop -m fast` guarantees that all WAL has been replicated to standby.
>
> It does not. It makes it likely, but the connection to the standby
> could be not up just then, you could run into walsender timeout, and a
> bunch of other scenarios.

AFAIK in this case exit code would not be zero. Even if archiver has not been able to archive all WALs before timeout for shutting down happened, exit code will not be zero.

>
>
>> But just in case we also check that "Latest checkpoint's REDO
>> location" from control file on old master after shutdown is less than
>> pg_last_xlog_replay_location() on standby to be promoted.
>
> The *redo* location? Or the checkpoint location itself? Because the
> latter is what needs to be *equal* than the replay location not less
> than. Normally there won't be other records inbetween, but that's not
> guaranteed.

I've asked about it some time ago [1]. In that case checkpoint location and redo location were equal after shutdown and last replay location on standby was higher on 104 bytes (the size of shutdown checkpoint record).

But we do check exactly redo location. Should we change it for checking checkpoint location?

[1] https://www.postgresql.org/message-id/A7683985-2EC2-40AD-AAAC-B44BD0F29723%40simply.name

>
>
>> And if something would go wrong in above logic, postgres will not let you attach old master as a standby of new master. So it is highly probable not a setup problem.
>
> There's no such guarantee. There's a bunch of checks that'll somewhat
> likely trigger, but nothing more than that.
>
> - Andres

--
May the force be with you…
https://simply.name

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-05-27 18:50:03 Re: Surjective functional indexes
Previous Message Andres Freund 2017-05-27 16:56:09 Re: Broken hint bits (freeze)