
Re: Teaching pg_receivexlog to follow timeline switches

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Teaching pg_receivexlog to follow timeline switches
Date: 2013-01-16 16:08:31
Message-ID: 50F6D07F.9010207@vmware.com
Lists: pgsql-hackers
On 15.01.2013 20:22, Fujii Masao wrote:
> On Tue, Jan 15, 2013 at 11:05 PM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com>  wrote:
>> Now that a standby server can follow timeline switches through streaming
>> replication, we should teach pg_receivexlog to do the same. Patch
>> attached.
>>
>> I made one change to the way START_STREAMING command works, to better
>> support this. When a standby server reaches the end of the timeline it's streaming
>> from the master, it stops streaming, fetches any missing timeline history files,
>> and parses the history file of the latest timeline to figure out where to
>> continue. However, I don't want to parse timeline history files in
>> pg_receivexlog. Better to keep it simple. So instead, I modified the
>> server-side code for START_STREAMING to return the next timeline's ID at the
>> end, and used that in pg_receivexlog. I also modified BASE_BACKUP to return
>> not only the start XLogRecPtr, but also the corresponding timeline ID.
>> Otherwise we might try to start streaming from wrong timeline if you issue a
>> BASE_BACKUP at the same moment the server switches to a new timeline.
>>
>> When pg_receivexlog switches timeline, what to do with the partial file on
>> the old timeline? When the timeline changes in the middle of a WAL segment,
>> the segment on the old timeline is only half-filled. For example, when
>> timeline changes from 1 to 2, you'll have this in pg_xlog:
>>
>> 000000010000000000000006
>> 000000010000000000000007
>> 000000010000000000000008
>> 000000020000000000000008
>> 00000002.history
>>
>> The segment 000000010000000000000008 is only half-filled, as the timeline
>> changed in the middle of that segment. The beginning portion of that file is
>> duplicated in 000000020000000000000008, with the timeline-changing
>> checkpoint record right after the duplicated portion.
>>
>> When we stream that with pg_receivexlog, and hit the timeline switch, we'll
>> have this situation in the client:
>>
>> 000000010000000000000006
>> 000000010000000000000007
>> 000000010000000000000008.partial
>>
>> What to do with the partial file? One option is to rename it to
>> 000000010000000000000008. However, if you then kill pg_receivexlog before it
>> has finished streaming a full segment from the new timeline, on restart it
>> will try to begin streaming WAL segment 000000010000000000000009, because it
>> sees that segment 000000010000000000000008 is already completed. That'd be
>> wrong.
>
> Can't we rename the .partial file safely after we receive a full segment
> of the WAL file with the new timeline and the same logid/segmentid?

I'd prefer to leave the .partial suffix in place, as the segment really 
isn't complete. It doesn't make a difference when you recover to the 
latest timeline, but if you have a more complicated scenario with 
multiple timelines that are still "alive", i.e. there's a server still 
actively generating WAL on that timeline, you'll easily get confused.

As an example, imagine that you have a master server, and one standby. 
You maintain a WAL archive for backup purposes with pg_receivexlog, 
connected to the standby. Now, for some reason, you get a split-brain 
situation and the standby server is promoted with new timeline 2, while 
the real master is still running. The DBA notices the problem, and kills 
the standby and pg_receivexlog. He deletes the XLOG files belonging to 
timeline 2 in pg_receivexlog's target directory, and re-points 
pg_receivexlog to the master while he re-builds the standby server from 
backup. At that point, pg_receivexlog will start streaming from the end 
of the zero-padded segment, not knowing that it was partial, and you 
have a hole in the archived WAL stream. Oops.
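
To make the failure mode concrete, here is a rough Python model of the restart
decision described above: resume right after the highest segment that looks
complete. It is a sketch, not pg_receivexlog's actual code. The function names
are hypothetical, and the segment arithmetic assumes the default 16 MB WAL
segment size (256 segments per log file ID).

```python
import re

# Assumption: default 16 MB WAL segments, i.e. 256 segments per log file ID.
SEGMENTS_PER_LOG = 256

# 8 hex digits each for timeline, log ID and segment ID, optional ".partial".
_WAL_NAME = re.compile(
    r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})(\.partial)?$"
)


def parse_wal_name(name):
    """Return (timeline, segment_number, is_partial), or None if not a WAL file."""
    m = _WAL_NAME.match(name)
    if m is None:
        return None
    timeline = int(m.group(1), 16)
    segno = int(m.group(2), 16) * SEGMENTS_PER_LOG + int(m.group(3), 16)
    return timeline, segno, m.group(4) is not None


def resume_segment(filenames):
    """Return the segment number to resume streaming from.

    Only files without the .partial suffix count as completed. Returns None
    when the directory holds no completed segment.
    """
    completed = []
    for name in filenames:
        parsed = parse_wal_name(name)
        if parsed is not None and not parsed[2]:
            completed.append(parsed[1])
    if not completed:
        return None
    return max(completed) + 1


# Target directory after the timeline 2 files were deleted, with the
# half-filled segment renamed to look complete:
renamed = [
    "000000010000000000000006",
    "000000010000000000000007",
    "000000010000000000000008",
]

# Same directory, but with the .partial suffix kept:
kept_partial = [
    "000000010000000000000006",
    "000000010000000000000007",
    "000000010000000000000008.partial",
]

print(resume_segment(renamed))       # resumes at 9, skipping the rest of 8
print(resume_segment(kept_partial))  # resumes at 8, so nothing is skipped
```

With the renamed file, the model resumes at segment 9 and never fetches the
remainder of segment 8 on timeline 1. With the .partial suffix kept, it
restarts at segment 8.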

The DBA could avoid that by also removing the last WAL segment on 
timeline 1, the one that was partial. But it's really not obvious that 
there's anything wrong with that segment. Keeping the .partial suffix 
makes it clear.

- Heikki

