
Fast promotion failure

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Fast promotion failure
Date: 2013-05-07 09:57:14
Lists: pgsql-hackers
While testing the bug from the "Assertion failure at standby promotion"
thread, I ran into a different bug in fast promotion. When the first
checkpoint after fast promotion is performed, there is no guarantee that
the checkpointer process is running with the correct, new
ThisTimeLineID. In CreateCheckPoint(), we have this:

> 	/*
> 	 * An end-of-recovery checkpoint is created before anyone is allowed to
> 	 * write WAL. To allow us to write the checkpoint record, temporarily
> 	 * enable XLogInsertAllowed.  (This also ensures ThisTimeLineID is
> 	 * initialized, which we need here and in AdvanceXLInsertBuffer.)
> 	 */
> 	if (flags & CHECKPOINT_END_OF_RECOVERY)
> 		LocalSetXLogInsertAllowed();

That ensures that ThisTimeLineID is updated when performing an
end-of-recovery checkpoint, but fast promotion skips the end-of-recovery
checkpoint, so this code is not executed for the first checkpoint after
promotion. The consequence is that the checkpoint is created with the
old timeline, and subsequent recovery from it will fail.
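
To make the failure mode concrete, here is a minimal standalone sketch of
it. This is not PostgreSQL code: shared_timeline, local_timeline,
SKETCH_END_OF_RECOVERY and the sketch_* functions are hypothetical
stand-ins for the shared-memory timeline, the checkpointer's process-local
ThisTimeLineID, and the flag check in CreateCheckPoint(). It only shows
that a process-local copy refreshed on just one code path stays stale on
every other path.

```c
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Hypothetical stand-in for the timeline kept in shared memory. */
static TimeLineID shared_timeline = 1;

/* Hypothetical stand-in for the checkpointer's local ThisTimeLineID. */
static TimeLineID local_timeline = 1;

/* Hypothetical flag, modeled on CHECKPOINT_END_OF_RECOVERY. */
#define SKETCH_END_OF_RECOVERY 0x01

/* Promotion switches the cluster to a new timeline in shared state only. */
static void
sketch_promote(void)
{
	shared_timeline++;
}

/*
 * Returns the timeline the checkpoint would be stamped with. The local
 * copy is refreshed only on the end-of-recovery path, so any other
 * checkpoint uses whatever value this process saw last.
 */
static TimeLineID
sketch_create_checkpoint(int flags)
{
	if (flags & SKETCH_END_OF_RECOVERY)
		local_timeline = shared_timeline;

	return local_timeline;
}
```

Calling sketch_promote() followed by sketch_create_checkpoint(0) models
the fast-promotion case: the checkpoint is stamped with the old timeline,
while passing SKETCH_END_OF_RECOVERY picks up the new one.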

I ran into this with the attached script. It sets up a master (M), a
standby (B), and a cascading standby (C). When I tried to simplify the
script by removing the cascading standby, the failure no longer
reproduced. The bug occurs in standby B, so I'm not sure why the
presence of the cascading standby makes any difference. Maybe it just
affects the timing.

- Heikki

Description: application/x-shellscript (1.8 KB)

