Re: Proposal of tunable fix for scalability of 8.4

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Proposal of tunable fix for scalability of 8.4
Date: 2009-03-15 19:25:24
Message-ID: BDFBB77C9E07BE4A984DAAE981D19F961AE959DB91@EXVMBX018-1.exch018.msoutlookonline.net
Lists: pgsql-performance

Top posting because my email client will mess up the inline:

Re: advance insert pointer.
I have no idea how complicated the "advance" part you allude to is. But can this be done without a lock at all?
An atomic compare-and-exchange (or compare-and-set, etc.) should do it, although page boundaries in the buffers could make it a bit more complicated than that. It sounds potentially lockless to me. CompareAndSet-like atomics never put the caller to sleep, so they would avoid lock-induced context switches entirely, and they generally work very well when the item that needs protecting is itself an atomic value like a pointer or an int. This is similar to, but lighter weight than, a spinlock.
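
A minimal sketch of that idea, using C11 atomics rather than anything in the PostgreSQL source. Every name and constant here (WAL_PAGE_SIZE, WAL_PAGE_HEADER, insert_pos, record_end, reserve) is hypothetical, and the page-boundary rule is a simplified assumption: each page crossed costs a fixed header.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define WAL_PAGE_SIZE   8192u  /* hypothetical page size */
#define WAL_PAGE_HEADER 24u    /* hypothetical per-page header overhead */

/* Shared insert position as a byte offset into the log (hypothetical). */
static _Atomic uint64_t insert_pos = WAL_PAGE_HEADER;

/*
 * Pure function: where does a record of `len` payload bytes end if it
 * starts at `start`?  Each page boundary crossed adds one page header.
 */
static uint64_t record_end(uint64_t start, uint64_t len)
{
    uint64_t pos = start;

    while (len > 0) {
        uint64_t room = WAL_PAGE_SIZE - (pos % WAL_PAGE_SIZE);

        if (len < room) {
            pos += len;
            len = 0;
        } else {
            /* fill the rest of this page, then skip the next header */
            pos += room + WAL_PAGE_HEADER;
            len -= room;
        }
    }
    return pos;
}

/*
 * Reserve space for a record without taking a lock.  Returns the start
 * offset of the reserved region.  If another thread moved insert_pos
 * first, the compare-exchange fails, reloads `start`, and we retry.
 */
static uint64_t reserve(uint64_t len)
{
    assert(len > 0);

    uint64_t start = atomic_load(&insert_pos);

    while (!atomic_compare_exchange_weak(&insert_pos, &start,
                                         record_end(start, len)))
        ;  /* `start` now holds the current value; recompute and retry */

    return start;
}
```

The boundary arithmetic stays outside the atomic step: the new position is computed from a snapshot, and the compare-exchange only publishes it if the snapshot is still current. Whether the real "advance" logic can be reduced to a pure function of one word like this is exactly the open question.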

________________________________________
From: Tom Lane [tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Saturday, March 14, 2009 9:09 AM
To: Heikki Linnakangas
Cc: Robert Haas; Scott Carey; Greg Smith; Jignesh K. Shah; Kevin Grittner; pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Proposal of tunable fix for scalability of 8.4

Yeah, that's been seen to be an issue before. I had the germ of an idea
about how to fix that:

... with no lock, determine size of WAL record ...
obtain WALInsertLock
identify WAL start address of my record, advance insert pointer
past record end
*release* WALInsertLock
without lock, copy record into the space just reserved

The idea here is to allow parallelization of the copying of data into
the buffers. The hold time on WALInsertLock would be very short. Maybe
it could even become a spinlock, though I'm not sure, because the
"advance insert pointer" bit is more complicated than it looks (you have
to allow for the extra overhead when crossing a WAL page boundary).
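
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not PostgreSQL code: a pthread mutex stands in for WALInsertLock, the names log_buf, insert_off, insert_lock, and log_insert are hypothetical, and the page-boundary overhead is ignored entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define LOG_BUF_SIZE 65536u  /* hypothetical buffer size */

static char log_buf[LOG_BUF_SIZE];
static size_t insert_off = 0;  /* next free byte, protected by insert_lock */
static pthread_mutex_t insert_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Insert a record whose size is already known.  Returns the start
 * offset, or (size_t)-1 if the record does not fit in the buffer.
 */
static size_t log_insert(const void *rec, size_t len)
{
    size_t start;

    assert(rec != NULL);

    /* Short critical section: only reserve the space. */
    pthread_mutex_lock(&insert_lock);
    if (len > LOG_BUF_SIZE - insert_off) {
        pthread_mutex_unlock(&insert_lock);
        return (size_t) -1;
    }
    start = insert_off;
    insert_off += len;  /* advance insert pointer past record end */
    pthread_mutex_unlock(&insert_lock);

    /* Without the lock: copy into the region no one else can claim. */
    memcpy(log_buf + start, rec, len);
    return start;
}
```

Because each caller owns a disjoint region after the reservation, the copies can run in parallel. The sketch does not address how a reader or flusher would learn that a reserved region has been completely filled.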
