Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slru.c race condition (was Re: TRAP: FailedAssertion("!((itemid)->lp_flags
Date: 2005-10-31 18:01:14
Message-ID: 200510311801.j9VI1EN10692@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Jim C. Nasby wrote:
> On Sun, Oct 30, 2005 at 06:17:53PM -0500, Tom Lane wrote:
> > I'd like Jim to test this theory by seeing if it helps to reverse the
> > order of the if-test elements at lines 294/295, ie make it look like
> >
> > if (shared->page_status[slotno] != SLRU_PAGE_READ_IN_PROGRESS ||
> > shared->page_number[slotno] != pageno)
> >
> > This won't do as a permanent patch, because it isn't guaranteed to fix
> > the problem on machines that don't strongly order writes, but it should
> > work on Opterons, at least well enough to confirm the diagnosis.
>
> Given your proposed fix on -patches, do you still need me to test this?
> Also, is there any heap corruption risk associated with this patch?

Because it is a test, I am not sure there is any way to know what the
possible impact of a bug is. If we knew there were bug in the patch,
it would have been fixed already.

> I'm also wondering what the effect of this is when assertions are turned
> off. My client had to go back to running with assertions turned off
> because of the performance impact. Are they now risking data corruption?
> Is there a way to turn on the assertion just in this code segment?
>
> This incident has made me wonder if it's worth creating two classes of
> assertions. The (hopefully more common) set of assertions would be for
> things that shouldn't happen, but if go un-caught won't result in heap
> corruption. A new set (well, existing asserts, but just re-classified)
> would be for things that if uncaught could result in heap corruption. My
> hope is that the set of critical assertions could be turned on by
> default, helping to identify race conditions and other bugs that
> conventional testing is unlikely to find.

That is probably overkill. Running with test patches isn't something we
expect folks to do often.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2005-10-31 18:01:58 Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags
Previous Message Tom Lane 2005-10-31 18:00:35 Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", )

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2005-10-31 18:01:58 Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags
Previous Message Tom Lane 2005-10-31 18:00:35 Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)", )