Re: New trigger option of pg_standby

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Guillaume Smet <guillaume(dot)smet(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New trigger option of pg_standby
Date: 2009-04-13 05:52:29
Message-ID: 3f0b79eb0904122252t1f4c5860v919d199f399ccf72@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Sat, Apr 11, 2009 at 1:31 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> Fujii-san,
>
> I like the new patch using the content of the file to determine the
> mode. Much easier to use at failover time.
>
> On Fri, 2009-04-10 at 12:47 +0900, Fujii Masao wrote:
>
>> > One problem with this patch is that in smart mode, the trigger file is not
>> > deleted. That's different from current pg_standby behavior, and makes
>> > accidental failovers after one failover more likely.
>>
>> Yes, it's because pg_standby cannot be sure when the trigger file
>> can be removed in smart mode. If the trigger file is deleted as soon
>> as it's found, just like in fast mode, pg_standby may keep waiting
>> for WAL file again.
>
> My understanding of smart mode is fairly simple:
>
>   if (triggered)
>   {
>        if (smartMode && nextWALfile+1 exists)
>                exit(0);
>        else
>        {
>                delete trigger file
>                exit(1);
>        }
>   }
>
> If you perform a file lookahead (the +1) as shown above then you avoid
> the problem Heikki observes.

Thanks for the suggestion!

A lookahead (the +1) may have pg_standby get stuck as follows.
Am I missing something?

1. the trigger file containing "smart" is created.
2. pg_standby is executed.
2-1. nextWALfile is restored.
2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
3. the restored nextWALfile is applied.
4. pg_standby is executed again to restore nextWALfile+1.
5. pg_standby gets stuck because the trigger file and nextWALfile+1
don't exist.

But, a lookahead nextWALfile seems to work fine.

if (triggered)
{
if (smartMode && nextWALfile exists)
exit(0)
else
{
delete trigger file
exit(1)
}
}

1. the trigger file containing "smart" is created.
2. pg_standby is executed.
2-1. nextWALfile is restored.
3. the restored nextWALfile is applied.
4. pg_standby is executed again to restore nextWALfile+1.
4-1. the trigger file is deleted because nextWALfile+1 doesn't exist.
5. the startup process fails to read nextWALfile+1.
6. pg_standby is executed again to re-fetch nextWALfile.
6-1. nextWALfile is restored.
6-2. pg_standby doesn't get stuck because nextWALfile exists.

Furthermore, pg_standby may have to check if nextWALfile exists
not only in archiveLocation but also in pg_xlog. Because, when
pg_xlog of the primary server can be read at failover, WAL files
in it may be copied to pg_xlog of the standby server to be applied.
(but, not sure if it's better to copy such files to pg_xlog instead of
archiveLocation in this case).

Comments?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Itagaki Takahiro 2009-04-13 10:13:15 Re: Solution of the file name problem of copy on windows.
Previous Message Abhijit Menon-Sen 2009-04-13 03:48:38 Re: [PATCH] Add a test for pg_get_functiondef()