Re: pg_standby: Question about truncation of trigger file in fast failover

From: Neil Thombre <nt(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_standby: Question about truncation of trigger file in fast failover
Date: 2014-02-21 00:02:26
Message-ID: CACqBBXHrJ3Gr6zjO3BrAnsyeL=8T7L1LR=PGg9aebOjyHwjDyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 19, 2014 at 1:42 PM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com
> wrote:

> On 02/19/2014 11:15 PM, Neil Thombre wrote:
>
>> And that is where I have a question. I noticed that in pg_standby.c when
>> we
>> detect the word "fast" in the trigger file we truncate the file.
>>
>> https://github.com/postgres/postgres/blob/REL9_1_11/
>> contrib/pg_standby/pg_standby.c#L456
>>
>> There is also a comment above it about not "upsetting" the server.
>>
>> https://github.com/postgres/postgres/blob/REL9_1_11/
>> contrib/pg_standby/pg_standby.c#L454
>>
>> What is the purpose of truncating the file? To do a smart failover once
>> you
>> come out of standby? But, when I look at xlog.c, when we come out of
>> standby due to a failure returned by restore_command, we call
>> CheckForStandbyTrigger() here:
>>
>> https://github.com/postgres/postgres/blob/REL9_1_11/src/
>> backend/access/transam/xlog.c#L10441
>>
>> Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
>> the debugger that the unlinking happens before xlog.c makes a call to the
>> next restore_command. So, what is the reason for truncating the "fast"
>> word from the trigger file if the file is going to be deleted soon after
>> it
>> is discovered? How will we "upset" the server if we don't?
>>
>
> At end-of-recovery, the server will fetch again the last WAL file that was
> replayed. If it can no longer find it, because restore_command now returns
> an error even though it succeeded for the same file few seconds earlier, it
> will throw an error and refuse to start up.
>

The restore_command returns error exactly once in my setup. So the next
time around, it does go back and is able to fetch the last segment
successfully applied. Let me go through the steps:

<New restore_command> = ! fgrep -qsi fast <trigger_file> && <Old
restore_command>

1. Until it finds "fast" in the trigger file it will continue running the
(old) restore_command. It applies 0000000300000C3100000099 successfully
from the archive.
2. I echo "fast" >> trigger_file
3. The next restore_command returns failure (because of the first part of
my &&) so it never ends up applying the next segment it was supposed to
apply,i.e., 0000000300000C310000009A
4. Db comes out of standby and checks for trigger file in
CheckForStandbyTrigger() that unlinks it. Now the trigger file is gone!
5. Next, it tries to read the last applied segment 0000000300000C3100000099
again - the new restore_command WILL NOT return failure because the trigger
file is gone and the first part (before the &&) is true and it will run the
old restore_command and try get the same file 0000000300000C3100000099
6. It is during reapplying this file that I get the following error:

Feb 7 00:37:45 XXXXXXXXXXXXXXXXXXXXXXXX LOG: restored log file
"0000000300000C3100000099" from archive

Feb 7 00:37:45 XXXXXXXXXXXXXXXXXXXXXXXX FATAL: WAL ends before consistent
recovery point

This error comes from:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L6782-L6783

Therefore, I feel that something was amiss in my setup. And I wanted to
understand the motive/tribal-knowledge behind the truncation part of
pg_standby's fast failover so as not to upset the server. In other words, I
have a feeling that by not truncating the trigger file I am inadvertently
upsetting the server which is the cause of my FATAL error.

>
> That's the way it used to be until 9.2, anyway. In 9.2, the behavior was
> changed, so that the server keeps all the files restored from archive, in
> pg_xlog, so that it can access them again. I haven't tried, but it's
> possible that the truncation is no longer necessary. Try it, with 9.1 and
> 9.3, and see what happens.
>
>
It may very well be that 9.3 will not have this problem. I will definitely
try this out when I have a chance - this problem is pretty obscure and
rarely happens on our customer databases, therefore we just work with
whatever database version we have at the moment to get as much forensics
and try out any solutions.

> - Heikki
>

Thanks a lot for your help.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Harris 2014-02-21 00:27:28 Re: Minor performance improvement in transition to external sort
Previous Message Noah Misch 2014-02-20 23:42:12 Re: Draft release notes up for review