Re: 8.3.5: Crash in CountActiveBackends() - lockless race?

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.5: Crash in CountActiveBackends() - lockless race?
Date: 2009-03-30 13:09:34
Message-ID: 49D0C48E.6030602@enterprisedb.com
Lists: pgsql-hackers
Marko Kreen wrote:
> 1.  Add memory barrier to ProcArrayAdd/ProcArrayRemove between pointer
>     and count update.  This guarantees that partial slots will not be seen.
> 
> 2.  Always clear the pointer in ProcArrayRemove and check for NULL
>     in all "lockless" access points.  This guarantees that partial slots
>     will be either NULL or just-freed ones, before the barrier in
>     LWLockRelease(), which means the contents should be still sensible.
> 
> #1 seems to require platform-specific code, which we don't have yet?

Marking the pointer as volatile should work.
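
For comparison, here is a rough sketch of what #1 could look like with C11 <stdatomic.h>, which postdates this thread and is not something the tree uses. Slot, SlotArray, slot_add and slot_count_active are hypothetical stand-ins, not the real PGPROC/ProcArray code, and the sketch assumes writers are already serialized by a lock and only ever append:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical stand-ins for PGPROC and the shared proc array. */
typedef struct Slot { int active; } Slot;

typedef struct SlotArray
{
    atomic_int  count;              /* number of published slots */
    Slot       *slots[MAX_SLOTS];
} SlotArray;

/* Writer side; assumed to be called with the array lock held. */
static void
slot_add(SlotArray *arr, Slot *s)
{
    int n = atomic_load_explicit(&arr->count, memory_order_relaxed);

    assert(n < MAX_SLOTS);
    arr->slots[n] = s;              /* 1. fill in the pointer */
    /* 2. publish the new count; the release store orders it after step 1 */
    atomic_store_explicit(&arr->count, n + 1, memory_order_release);
}

/* Lockless reader: the acquire load pairs with the release store above. */
static int
slot_count_active(SlotArray *arr)
{
    int n = atomic_load_explicit(&arr->count, memory_order_acquire);
    int active = 0;

    for (int i = 0; i < n; i++)
        if (arr->slots[i]->active)
            active++;
    return active;
}
```

This only covers the add path. Removal, which shuffles entries around, would still need separate thought.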

> So #2 may be easier solution.

Agreed. More importantly, it puts the onus of getting it right on
CountActiveBackends, which is the one breaking the rules. We don't
necessarily need to clear the pointer in ProcArrayRemove either, since
the count doesn't need to be accurate.

Barring objections, I'll do #2:

*** procarray.c.~1.40.~	2008-01-09 23:52:36.000000000 +0200
--- procarray.c	2009-03-30 16:04:00.000000000 +0300
***************
*** 1088,1093 ****
--- 1088,1101 ----
   	for (index = 0; index < arrayP->numProcs; index++)
   	{
   		volatile PGPROC *proc = arrayP->procs[index];
+ 		
+ 		/*
+ 		 * Since we're not holding a lock, need to check that the pointer
+ 		 * is valid. Someone holding the lock could have increased numProcs
+ 		 * already, but not yet assigned a valid pointer to the array.
+ 		 */
+ 		if (proc == NULL)
+ 			continue;

   		if (proc == MyProc)
   			continue;			/* do not count myself */
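
To illustrate the idea outside the tree, here is a minimal standalone sketch of a lockless scan that tolerates a slot whose pointer has not been set yet. Proc, ProcArray and count_active are simplified, hypothetical stand-ins for PGPROC, the shared proc array and CountActiveBackends, and the "active" flag is an assumption made for the example:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROCS 8

/* Hypothetical stand-ins for PGPROC and the shared proc array. */
typedef struct Proc { int active; } Proc;

typedef struct ProcArray
{
    int   numProcs;
    Proc *procs[MAX_PROCS];
} ProcArray;

/*
 * Count active entries without holding a lock.  The result is only an
 * estimate: we skip our own entry and any slot whose pointer is NULL,
 * which can happen if numProcs was bumped before the pointer was stored.
 */
static int
count_active(volatile ProcArray *arr, const Proc *self)
{
    int count = 0;
    int n = arr->numProcs;

    assert(n <= MAX_PROCS);
    for (int i = 0; i < n; i++)
    {
        Proc *p = arr->procs[i];    /* read the slot once */

        if (p == NULL)
            continue;               /* slot not filled in yet */
        if (p == self)
            continue;               /* do not count myself */
        if (p->active)
            count++;
    }
    return count;
}
```

With numProcs set to 4 and the last slot still NULL, the scan skips that slot instead of dereferencing it.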

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
