
Re: Teaching pg_receivexlog to follow timeline switches

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Teaching pg_receivexlog to follow timeline switches
Date: 2013-01-16 17:06:48
Lists: pgsql-hackers
On Thu, Jan 17, 2013 at 1:08 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> On 15.01.2013 20:22, Fujii Masao wrote:
>> On Tue, Jan 15, 2013 at 11:05 PM, Heikki Linnakangas
>> <hlinnakangas(at)vmware(dot)com>  wrote:
>>> Now that a standby server can follow timeline switches through streaming
>>> replication, we should teach pg_receivexlog to do the same. Patch
>>> attached.
>>>
>>> I made one change to the way the START_STREAMING command works, to better
>>> support this. When a standby server reaches the timeline it's streaming
>>> from the master, it stops streaming, fetches any missing timeline history
>>> files, and parses the history file of the latest timeline to figure out
>>> where to continue. However, I don't want to parse timeline history files
>>> in pg_receivexlog. Better to keep it simple. So instead, I modified the
>>> server-side code for START_STREAMING to return the next timeline's ID at
>>> the end, and used that in pg_receivexlog. I also modified BASE_BACKUP to
>>> return not only the start XLogRecPtr, but also the corresponding timeline
>>> ID. Otherwise we might try to start streaming from the wrong timeline if
>>> you issue a BASE_BACKUP at the same moment the server switches to a new
>>> timeline.
>>>
>>> When pg_receivexlog switches timeline, what to do with the partial file
>>> on the old timeline? When the timeline changes in the middle of a WAL
>>> segment, the segment on the old timeline is only half-filled. For
>>> example, when the timeline changes from 1 to 2, you'll have this in
>>> pg_xlog:
>>> 000000010000000000000006
>>> 000000010000000000000007
>>> 000000010000000000000008
>>> 000000020000000000000008
>>> 00000002.history
>>> The segment 000000010000000000000008 is only half-filled, as the timeline
>>> changed in the middle of that segment. The beginning portion of that file
>>> is duplicated in 000000020000000000000008, with the timeline-changing
>>> checkpoint record right after the duplicated portion.
>>>
>>> When we stream that with pg_receivexlog, and hit the timeline switch,
>>> we'll have this situation in the client:
>>> 000000010000000000000006
>>> 000000010000000000000007
>>> 000000010000000000000008.partial
>>> What to do with the partial file? One option is to rename it to
>>> 000000010000000000000008. However, if you then kill pg_receivexlog before
>>> it has finished streaming a full segment from the new timeline, on
>>> restart it will try to begin streaming WAL segment
>>> 000000010000000000000009, because it sees that segment
>>> 000000010000000000000008 is already completed. That'd be wrong.
>> Can't we rename the .partial file safely after we receive a full segment
>> of the WAL file with the new timeline and the same logid/segmentid?
> I'd prefer to leave the .partial suffix in place, as the segment really
> isn't complete. It doesn't make a difference when you recover to the latest
> timeline, but if you have a more complicated scenario with multiple
> timelines that are still "alive", ie. there's a server still actively
> generating WAL on that timeline, you'll easily get confused.
> As an example, imagine that you have a master server, and one standby. You
> maintain a WAL archive for backup purposes with pg_receivexlog, connected to
> the standby. Now, for some reason, you get a split-brain situation and the
> standby server is promoted with new timeline 2, while the real master is
> still running. The DBA notices the problem, and kills the standby and
> pg_receivexlog. He deletes the XLOG files belonging to timeline 2 in
> pg_receivexlog's target directory, and re-points pg_receivexlog to the master
> while he re-builds the standby server from backup. At that point,
> pg_receivexlog will start streaming from the end of the zero-padded segment,
> not knowing that it was partial, and you have a hole in the archived WAL
> stream. Oops.
> The DBA could avoid that by also removing the last WAL segment on timeline
> 1, the one that was partial. But it's really not obvious that there's
> anything wrong with that segment. Keeping the .partial suffix makes it
> clear.

Thanks for elaborating on why the .partial suffix should be kept.
I agree that keeping the .partial suffix would be safer.
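
To make the restart problem concrete, here is a rough Python sketch of the
decision described in the quoted text: scan the target directory, find the
highest completed segment on a timeline, and resume at the next one. This is
not the pg_receivexlog code. The helper names (parse_wal_name, wal_name,
restart_segment) are hypothetical, and the sketch assumes 16 MB segments,
i.e. 256 segments per log ID, and that .partial files are never counted as
completed.

```python
import re

# Assumption: 16 MB WAL segments, so 256 segments fit in one 4 GB log ID.
SEGMENTS_PER_LOG = 256

# 8 hex digits each for timeline, log ID and segment, plus optional ".partial".
_WAL_NAME = re.compile(
    r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})(\.partial)?$"
)


def parse_wal_name(name):
    """Return (timeline, segment_number, is_partial), or None for other files."""
    m = _WAL_NAME.match(name)
    if m is None:
        return None  # e.g. "00000002.history"
    tli, log, seg = (int(g, 16) for g in m.group(1, 2, 3))
    return tli, log * SEGMENTS_PER_LOG + seg, m.group(4) is not None


def wal_name(tli, segno):
    """Build the 24-character file name for a timeline and segment number."""
    return "%08X%08X%08X" % (
        tli, segno // SEGMENTS_PER_LOG, segno % SEGMENTS_PER_LOG
    )


def restart_segment(filenames, tli):
    """Name of the segment to resume streaming at on timeline `tli`.

    Only files without the .partial suffix count as completed. Returns None
    when no completed segment exists for that timeline.
    """
    completed = []
    for name in filenames:
        parsed = parse_wal_name(name)
        if parsed is None:
            continue
        file_tli, segno, is_partial = parsed
        if file_tli == tli and not is_partial:
            completed.append(segno)
    if not completed:
        return None
    return wal_name(tli, max(completed) + 1)


# Suffix kept: streaming resumes at segment 8, so nothing is skipped.
kept = [
    "000000010000000000000006",
    "000000010000000000000007",
    "000000010000000000000008.partial",
]
resume_kept = restart_segment(kept, 1)

# Suffix dropped: segment 8 looks complete and streaming resumes at segment 9.
renamed = [
    "000000010000000000000006",
    "000000010000000000000007",
    "000000010000000000000008",
]
resume_renamed = restart_segment(renamed, 1)
```

With the suffix kept, the half-filled segment 8 is fetched again from the
master. With the file renamed, the second half of segment 8 on timeline 1 is
never archived, which is the hole in the WAL stream Heikki describes.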


Fujii Masao
