Re: New trigger option of pg_standby

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Guillaume Smet <guillaume(dot)smet(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New trigger option of pg_standby
Date: 2009-04-13 11:30:25
Message-ID: 3f0b79eb0904130430w1374d86bg7c8f973ef75f9c15@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Mon, Apr 13, 2009 at 7:21 PM, Guillaume Smet
<guillaume(dot)smet(at)gmail(dot)com> wrote:
> On Mon, Apr 13, 2009 at 7:52 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> 1. the trigger file containing "smart" is created.
>> 2. pg_standby is executed.
>>    2-1. nextWALfile is restored.
>>    2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
>> 3. the restored nextWALfile is applied.
>> 4. pg_standby is executed again to restore nextWALfile+1.
>
> I don't think it should happen. IMHO, it's an acceptable compromise to
> replay all the WAL files present when I created the trigger file. So
> if I have the smart shutdown trigger file and I don't have any
> nextWALfile+1, I can remove the trigger file and stop the recovery:
> pg_standby won't be executed again after that, even if a nextWALfile+1
> appeared while replaying the previous WAL file.

The scenario I described does not depend on whether
nextWALfile+1 exists. To clarify the details:

If pg_standby restores nextWALfile, deletes the trigger file and
exits with 1 (i.e. signals the end of recovery to the startup
process), the startup process considers that pg_standby failed,
and tries to read nextWALfile from pg_xlog instead of the
restored file named "RECOVERYXLOG". This is undesirable
because some transactions would be lost if nextWALfile
doesn't exist in pg_xlog. So, exit(0) should be called when
nextWALfile exists.

On the other hand, if pg_standby restores nextWALfile,
deletes the trigger file and calls exit(0), the startup process
replays the restored file and then tries to read nextWALfile+1,
because it doesn't know whether nextWALfile is the last valid WAL
file. So, pg_standby may be executed again even after the trigger
file is deleted.
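
To put the two cases side by side, here is a rough sketch of the exit
status choice described above. The function name and its argument are
made up for illustration; this is not pg_standby's actual code.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of pg_standby): choose the exit status
 * once the "smart" trigger file has been seen.
 */
int
smart_trigger_exit_status(bool next_wal_restored)
{
    /*
     * nextWALfile was restored: report success so that the startup
     * process replays the restored file instead of looking in pg_xlog.
     */
    if (next_wal_restored)
        return 0;

    /* Nothing was restored: exit with 1 to signal the end of recovery. */
    return 1;
}
```

Note that returning 0 here is exactly what lets the startup process
ask for nextWALfile+1 afterwards.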

Am I missing something?

> That said, stupid question: do we have a way to know the nextWALfile+1
> name to test if it exists? nextWALfile is transmitted through the
> restore_command API and I'm wondering if we can have nextWALfile+1
> name without changing the restore_command API.

Probably yes; I think the following three steps are required:
- Get the timeline, logid and segid from the name of nextWALfile.
- Increment the logid and segid pair using the NextLogSeg macro.
- Calculate the name of nextWALfile+1 using the XLogFileName macro.
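
A minimal standalone sketch of those three steps follows. It does not
include the server headers; instead it re-implements the logic of
NextLogSeg and XLogFileName locally, assuming the default 16 MB segment
size and the 24-character TTTTTTTTLLLLLLLLSSSSSSSS file name layout.
The function name next_wal_file_name and the SKETCH_* constants are
hypothetical, chosen only for this example.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB WAL segments. */
#define SKETCH_SEG_SIZE      (16u * 1024u * 1024u)
/* Local stand-in for XLogSegsPerFile (255 with 16 MB segments). */
#define SKETCH_SEGS_PER_FILE (0xFFFFFFFFu / SKETCH_SEG_SIZE)
/* Length of a WAL segment file name: 3 fields of 8 hex digits. */
#define SKETCH_FNAME_LEN     24

/*
 * Hypothetical helper: given the name of nextWALfile, write the name of
 * nextWALfile+1 into "next". Returns 0 on success, or -1 if "cur" is not
 * a plain WAL segment name or the output buffer is too small.
 */
int
next_wal_file_name(const char *cur, char *next, size_t nextlen)
{
    unsigned int tli, log, seg;

    /* Reject anything that is not exactly 24 upper-case hex digits. */
    if (strlen(cur) != SKETCH_FNAME_LEN ||
        strspn(cur, "0123456789ABCDEF") != SKETCH_FNAME_LEN)
        return -1;
    if (nextlen < SKETCH_FNAME_LEN + 1)
        return -1;

    /* Step 1: get the timeline, logid and segid from the file name. */
    if (sscanf(cur, "%08X%08X%08X", &tli, &log, &seg) != 3)
        return -1;

    /* Step 2: same idea as NextLogSeg: wrap segid and bump logid. */
    if (seg >= SKETCH_SEGS_PER_FILE - 1)
    {
        log++;
        seg = 0;
    }
    else
        seg++;

    /* Step 3: same layout as XLogFileName. */
    snprintf(next, nextlen, "%08X%08X%08X", tli, log, seg);
    return 0;
}
```

With these assumptions, "000000010000000000000003" maps to
"000000010000000000000004", and "0000000100000000000000FE" wraps to
"000000010000000100000000". In pg_standby itself the real macros from
the server headers would presumably be used instead of these copies.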

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
