Skip site navigation (1) Skip section navigation (2)

Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>,Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)
Date: 2005-10-31 17:42:38
Message-ID: 20051031174238.GF20349@pervasive.com (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-patches
On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
> I'd like Jim to test this theory by seeing if it helps to reverse the
> order of the if-test elements at lines 294/295, ie make it look like
> 
>         if (shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS ||
>             shared->page_number[slotno] != pageno)
> 
> This won't do as a permanent patch, because it isn't guaranteed to fix
> the problem on machines that don't strongly order writes, but it should
> work on Opterons, at least well enough to confirm the diagnosis.

Given your proposed fix on -patches, do you still need me to test this?
Also, is there any heap corruption risk associated with this patch?

I'm also wondering what the effect of this is when assertions are turned
off. My client had to go back to running with assertions turned off
because of the performance impact. Are they now risking data corruption?
Is there a way to turn on the assertion just in this code segment?

This incident has made me wonder if it's worth creating two classes of
assertions. The (hopefully more common) set of assertions would be for
things that shouldn't happen, but if go un-caught won't result in heap
corruption. A new set (well, existing asserts, but just re-classified)
would be for things that if uncaught could result in heap corruption. My
hope is that the set of critical assertions could be turned on by
default, helping to identify race conditions and other bugs that
conventional testing is unlikely to find.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby(at)pervasive(dot)com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

In response to

Responses

pgsql-hackers by date

Next:From: Jim C. NasbyDate: 2005-10-31 17:46:19
Subject: Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)
Previous:From: Tom LaneDate: 2005-10-31 17:03:54
Subject: Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)

pgsql-patches by date

Next:From: Jim C. NasbyDate: 2005-10-31 17:46:19
Subject: Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)
Previous:From: Tom LaneDate: 2005-10-31 17:03:54
Subject: Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group