Re: New trigger option of pg_standby

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Guillaume Smet <guillaume(dot)smet(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New trigger option of pg_standby
Date: 2009-04-23 07:49:13
Message-ID: 49F01D79.8020805@enterprisedb.com
Lists: pgsql-hackers

Fujii Masao wrote:
> On Wed, Apr 22, 2009 at 4:27 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> Fujii Masao wrote:
>>> On Tue, Apr 14, 2009 at 2:41 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
>>> wrote:
>>>> I'd like to propose another simple idea; pg_standby deletes the
>>>> trigger file *whenever* the nextWALfile is a timeline history file.
>>>> A timeline history file is restored at the end of recovery, so it's
>>>> guaranteed that the trigger file is deleted whether nextWALfile
>>>> exists or not.
>>>>
>>>> A timeline history file is restored also at the beginning of
>>>> recovery, so the accidentally remaining trigger file is deleted
>>>> in early warm-standby as a side-effect of this idea.
>>> Here is the revised patch as above.
>>>
>>> If you notice something, please feel free to comment.
>> Ok, looking at this in more detail now. A couple of small things:
>>
>> We mustn't remove the trigger file immediately even in fast mode. As noted
>> elsewhere in this thread, we have the same bug in fast mode where pg_standby
>> gets stuck if you copy WAL files directly into pg_xlog.
>
> Yes, there is the same problem also in fast mode. But, in fast
> mode, the trigger file has to be deleted immediately if it's found.
> Otherwise, recovery may fail as follows.
>
> 1. pg_standby finds the trigger file for fast mode, and returns
> non-zero without deleting the trigger file.
> 2. the startup process tries to read the WAL file from pg_xlog,
> but it's not found.
> 3. the startup process tries to restore the last applied WAL file
> using pg_standby.
> 4. (Again) pg_standby finds the trigger file for fast mode, and
> returns non-zero without deleting the trigger file.
> 5. the startup process tries to read the last applied WAL file,
> but it's not found.
> (though the last applied file was of course restored before,
> the restored one cannot be read here)
> 6. recovery fails because the last applied WAL file cannot be
> read.
>
> On the other hand, if pg_standby returns 0 also in fast mode
> when the requested file and trigger file exist, ISTM that there
> is not much difference between fast and smart mode; also in
> fast mode, all the available WAL files would be applied.

Hmm, pg_standby could truncate the trigger file, so that it acts like a
smart trigger in subsequent pg_standby invocations. That assumes the
postgres user has write access to the file. It probably does, because it
can delete it, but conceivably it has only read access to the file and
write access to the directory it's in.
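
A rough sketch of the truncation idea, not pg_standby's actual code. The
names check_trigger and TriggerMode are hypothetical, and the sketch
assumes that a trigger file starting with "fast" requests fast failover
while an empty file (or any other content) is treated as a smart trigger:

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names, chosen for illustration only. */
typedef enum
{
    NO_TRIGGER,
    SMART_TRIGGER,
    FAST_TRIGGER
} TriggerMode;

/*
 * Inspect the trigger file at "path".
 *
 * Assumption: content starting with "fast" means fast failover; anything
 * else, including an empty file, means smart failover.  On a fast trigger
 * the file is truncated rather than deleted, so the next invocation sees
 * an empty file and behaves as if a smart trigger had been given.
 */
TriggerMode
check_trigger(const char *path)
{
    char    buf[32] = {0};
    ssize_t len;
    int     fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NO_TRIGGER;      /* no trigger file, or it is unreadable */

    len = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    if (len < 0)
        return NO_TRIGGER;

    if (strncmp(buf, "fast", 4) == 0)
    {
        /*
         * Needs write permission on the file itself; permission to
         * delete it (write access on the directory) is not enough.
         */
        if (truncate(path, 0) != 0)
            fprintf(stderr, "could not truncate trigger file \"%s\"\n", path);
        return FAST_TRIGGER;
    }

    return SMART_TRIGGER;
}
```

With this, the first call on a file containing "fast" reports a fast
trigger, and later calls on the now-empty file report a smart trigger.
If the truncate fails for lack of write access, the sketch only warns and
the file stays a fast trigger, so the problem described above remains.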

This is getting complicated, though. I guess it would be enough to
document that you mustn't copy any extra files into pg_xlog if you use a
fast trigger.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
