
Load Distributed Checkpoints, take 3

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Load Distributed Checkpoints, take 3
Date: 2007-06-20 13:47:31
Message-ID: 46792FF3.8000301@enterprisedb.com
Lists: pgsql-patches
Here's an updated WIP patch for load distributed checkpoints.

I added a spinlock to protect the signaling fields shared between the 
bgwriter and backends. The current lock-free approach becomes really 
difficult to get right as the patch adds two new flags, both of which 
are more important than the existing ckpt_time_warn flag.
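
As a rough illustration of the idea (not the code in the attached patch), 
the sketch below guards a small set of signaling fields with a spinlock, 
so that a backend's flag update and its snapshot of ckpt_started happen in 
one critical section. The struct layout, the function names, and the 
ckpt_force field are assumptions made for this example, and a C11 
atomic_flag stands in for a PostgreSQL spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared bgwriter request area. */
typedef struct
{
    atomic_flag lock;           /* stands in for a PostgreSQL spinlock */
    int         ckpt_started;   /* advanced when a checkpoint begins */
    int         ckpt_done;      /* advanced when a checkpoint completes */
    bool        ckpt_time_warn; /* existing flag mentioned in the post */
    bool        ckpt_force;     /* assumed name for one of the new flags */
} CkptSignals;

static void lock_acquire(atomic_flag *l)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;
}

static void lock_release(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

static void signals_init(CkptSignals *s)
{
    atomic_flag_clear(&s->lock);
    s->ckpt_started = 0;
    s->ckpt_done = 0;
    s->ckpt_time_warn = false;
    s->ckpt_force = false;
}

/* Backend side: set the request flag and snapshot ckpt_started together. */
static int backend_request(CkptSignals *s, bool force)
{
    int old_started;

    lock_acquire(&s->lock);
    old_started = s->ckpt_started;
    if (force)
        s->ckpt_force = true;
    lock_release(&s->lock);
    return old_started;
}

/* bgwriter side: consume the flag and advance ckpt_started together. */
static bool bgwriter_begin(CkptSignals *s)
{
    bool force;

    lock_acquire(&s->lock);
    force = s->ckpt_force;
    s->ckpt_force = false;
    s->ckpt_started++;
    lock_release(&s->lock);
    return force;
}

/* bgwriter side: publish completion of the checkpoint that was started. */
static void bgwriter_finish(CkptSignals *s)
{
    lock_acquire(&s->lock);
    assert(s->ckpt_done < s->ckpt_started);
    s->ckpt_done = s->ckpt_started;
    lock_release(&s->lock);
}
```

Because both sides touch the flag and the counter under the same lock, any 
checkpoint whose ckpt_started value is greater than the backend's snapshot 
must have seen the backend's flag.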

In fact, I think there's a small race condition in CVS HEAD:

1. pg_start_backup() is called, which calls RequestCheckpoint
2. RequestCheckpoint takes note of the old value of ckpt_started
3. bgwriter wakes up from pg_usleep, and sees that we've exceeded 
checkpoint_timeout.
4. bgwriter increases ckpt_started to note that a new checkpoint has started
5. RequestCheckpoint signals bgwriter to start a new checkpoint
6. bgwriter calls CreateCheckpoint, with the force-flag set to false 
because this checkpoint was triggered by timeout
7. RequestCheckpoint sees that ckpt_started has increased, and starts to 
wait for ckpt_done to reach the new value.
8. CreateCheckpoint finishes immediately, because there was no XLOG 
activity since the last checkpoint.
9. RequestCheckpoint sees that ckpt_done matches ckpt_started, and returns.
10. pg_start_backup() continues, with potentially the same redo location 
and thus history filename as the previous backup.

Now I admit that the chances of that happening are extremely small; 
people don't usually make two pg_start_backup calls without *any* 
WAL-logged activity in between, for example. But as we add the new 
flags, avoiding scenarios like that becomes harder.
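
The interleaving in steps 1-10 can be replayed deterministically in a toy, 
single-threaded model. Everything here is an assumption made for 
illustration: the SimState struct, the function names, and the WAL 
positions (plain integers, with a checkpoint record costing one unit). The 
only behavior taken from the steps above is that a non-forced checkpoint 
returns immediately when no WAL was written since the last one.

```c
#include <assert.h>
#include <stdbool.h>

#define PREV_REDO 100L  /* redo location used by the previous backup */

/* Hypothetical, heavily simplified model of the shared state. */
typedef struct
{
    int  ckpt_started;
    int  ckpt_done;
    long insert_pos;        /* current WAL insert position */
    long after_last_ckpt;   /* insert position right after the last checkpoint */
    long redo_pos;          /* redo location of the last checkpoint */
} SimState;

/* A non-forced checkpoint is skipped if no WAL was written since the last one. */
static void sim_create_checkpoint(SimState *st, bool force)
{
    bool idle = (st->insert_pos == st->after_last_ckpt);

    if (force || !idle)
    {
        st->redo_pos = st->insert_pos;  /* redo starts at the insert point */
        st->insert_pos += 1;            /* the checkpoint record is WAL too */
        st->after_last_ckpt = st->insert_pos;
    }
    st->ckpt_done = st->ckpt_started;
}

/*
 * Replay steps 1-10 and return the redo location the second backup gets.
 * force_seen says whether the bgwriter noticed the backend's request
 * before calling CreateCheckpoint.
 */
static long replay_backup_request(bool force_seen)
{
    /* State right after the previous backup's checkpoint, no WAL since. */
    SimState st = { 1, 1, PREV_REDO + 1, PREV_REDO + 1, PREV_REDO };
    int      old_started;

    old_started = st.ckpt_started;           /* step 2 */
    st.ckpt_started++;                       /* steps 3-4: timeout fires */
    /* step 5: the signal is sent, but the checkpoint has already begun */
    sim_create_checkpoint(&st, force_seen);  /* steps 6 and 8 */
    assert(st.ckpt_started != old_started);  /* step 7 */
    assert(st.ckpt_done == st.ckpt_started); /* step 9 */
    return st.redo_pos;                      /* step 10 */
}
```

In this model, replay_backup_request(false) returns the same redo location 
as the previous backup, while replay_backup_request(true) returns a new one.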

Since the last patch, I did some cleanup and refactoring, and added a 
bunch of comments and user documentation.

I haven't yet changed GetInsertRecPtr to use the almost up-to-date value 
protected by the info_lck per Simon's suggestion, and I need to do some 
correctness testing. After that, I'm done with the patch.

PS. In case you wonder what took me so long since the last revision, I've 
spent a lot of time reviewing HOT.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachment: ldc-justwrites-3.patch
Description: text/x-diff (53.8 KB)
