Skip site navigation (1) Skip section navigation (2)

Re: Proposal of tunable fix for scalability of 8.4

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Scott Carey <scott(at)richrelevance(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-18 07:48:38
Message-ID: 1237362518.3953.181.camel@ebony.2ndQuadrant (view raw or flat)
Thread:
Lists: pgsql-performance
On Sat, 2009-03-14 at 12:09 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> > WALInsertLock is also quite high on Jignesh's list. That I've seen 
> > become the bottleneck on other tests too.
> 
> Yeah, that's been seen to be an issue before.  I had the germ of an idea
> about how to fix that:
> 
> 	... with no lock, determine size of WAL record ...
> 	obtain WALInsertLock
> 	identify WAL start address of my record, advance insert pointer
> 		past record end
> 	*release* WALInsertLock
> 	without lock, copy record into the space just reserved
> 
> The idea here is to allow parallelization of the copying of data into
> the buffers.  The hold time on WALInsertLock would be very short.  Maybe
> it could even become a spinlock, though I'm not sure, because the
> "advance insert pointer" bit is more complicated than it looks (you have
> to allow for the extra overhead when crossing a WAL page boundary).
> 
> Now the fly in the ointment is that there would need to be some way to
> ensure that we didn't write data out to disk until it was valid; in
> particular how do we implement a request to flush WAL up to a particular
> LSN value, when maybe some of the records before that haven't been fully
> transferred into the buffers yet?  The best idea I've thought of so far
> is shared/exclusive locks on the individual WAL buffer pages, with the
> rather unusual behavior that writers of the page would take shared lock
> and only the reader (he who has to dump to disk) would take exclusive
> lock.  But maybe there's a better way.  Currently I don't believe that
> dumping a WAL buffer (WALWriteLock) blocks insertion of new WAL data,
> and it would be nice to preserve that property.

Yeh, that's just what we'd discussed previously:
http://markmail.org/message/gectqy3yzvjs2hru#query:Reworking%20WAL%
20locking+page:1+mid:gectqy3yzvjs2hru+state:results

Are you thinking of doing this for 8.4? :-)

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


In response to

pgsql-performance by date

Next:From: Simon RiggsDate: 2009-03-18 07:53:53
Subject: Re: Proposal of tunable fix for scalability of 8.4
Previous:From: Tom LaneDate: 2009-03-18 03:24:59
Subject: Re: Extremely slow intarray index creation and inserts.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group