Re: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [COMMITTERS] pgsql: Make standby server continuously retry restoring the next WAL
Date: 2010-03-18 14:27:59
Lists: pgsql-committers, pgsql-docs, pgsql-hackers

On Wed, Mar 17, 2010 at 7:35 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Fujii Masao wrote:
>> I found another missing feature in new file-based log shipping (i.e.,
>> standby_mode is enabled and 'cp' is used as restore_command).
>> After the trigger file is found, the startup process with pg_standby
>> tries to replay all of the WAL files in both pg_xlog and the archive.
>> So, when the primary fails, if the latest WAL file in pg_xlog of the
>> primary can be read, we can prevent the data loss by copying it to
>> pg_xlog of the standby before creating the trigger file.
>> On the other hand, the startup process with standby mode doesn't
>> replay the WAL files in pg_xlog after the trigger file is found. So
>> failover always causes the data loss even if the latest WAL file can
>> be read from the primary. And if the latest WAL file is copied to the
>> archive instead, it can be replayed but a PANIC error would happen
>> because it's not filled.
>> We should remove this restriction?
> Looking into this, I realized that we have a bigger problem related to
> this. Although streaming replication stores the streamed WAL files in
> pg_xlog, so that they can be re-replayed after a standby restart without
> connecting to the master, we don't try to replay those either. So if you
> restart standby, it will fail to start up if the WAL it needs can't be
> found in archive or by connecting to the master. That must be fixed.

I agree that this is a bigger problem. Since the standby always starts
walreceiver before replaying any WAL files in pg_xlog, walreceiver tries
to receive the WAL files following the REDO starting point even if they
are already in pg_xlog. IOW, the same WAL files might be shipped
from the primary to the standby many times. This behavior is wasteful
and should be addressed.
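
To illustrate the idea, here is a minimal sketch of how the standby could
decide what to replay locally and where streaming should begin. It is not
PostgreSQL code: the function name plan_recovery and the use of plain
integers as WAL segment numbers are assumptions made only for this example.

```python
# Hypothetical sketch, not PostgreSQL code.
# WAL segments are modelled as consecutive integers for simplicity.

def plan_recovery(redo_start_seg, local_segs):
    """Return (segments to replay from pg_xlog, first segment to stream).

    redo_start_seg: segment containing the REDO starting point.
    local_segs:     segment numbers already present in the standby's pg_xlog.
    """
    local = set(local_segs)
    replay = []
    seg = redo_start_seg
    # Replay every consecutive segment that is already available locally.
    while seg in local:
        replay.append(seg)
        seg += 1
    # Only the first segment missing locally needs to come from the primary.
    return replay, seg


if __name__ == "__main__":
    replay, stream_from = plan_recovery(5, [3, 4, 5, 6, 7, 9])
    print("replay from pg_xlog:", replay)
    print("start streaming at segment:", stream_from)
```

With this ordering, segments 5 to 7 would be replayed from pg_xlog and
walreceiver would request WAL starting at segment 8, instead of starting
again at the REDO point. A real implementation would have to track the
position by WAL record rather than by whole segment, since the last local
segment may be only partially filled.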


Fujii Masao
NTT Open Source Software Center
