Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRAP: FailedAssertion("!((itemid)->lp_flags & 0x01)",
Date: 2005-10-28 20:58:56
Message-ID: 5407.1130533136@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> All of them have in common that the slotno being passed ($3 below) is in
> SLRU_PAGE_READ_IN_PROGRESS state ... could it be a problem with lock
> reordering? Maybe somebody is trying to read in a page, and somebody
> else steals the buffer from under them. Not sure how likely is that.

It's even more interesting than that: in all three cases,
SlruSelectLRUPage has selected a "least recently used" page that is
still in READ_IN_PROGRESS state (i.e., we haven't finished faulting it in)
and is recursively calling SimpleLruReadPage to wait for that condition
to terminate.
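
To make the situation concrete, here is a minimal toy sketch of a victim-selection routine that can land on a slot whose read has not completed. It is not the actual slru.c code: the names ToySlot, ToyPageStatus, and toy_select_lru_slot are hypothetical, the model is single-threaded with no locking, and it assumes a simple "largest counter is least recently used" rule.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page states, loosely named after the states discussed in the thread. */
typedef enum {
    PAGE_EMPTY,
    PAGE_READ_IN_PROGRESS,
    PAGE_VALID
} ToyPageStatus;

/* Hypothetical slot descriptor; not the real SLRU shared-memory layout. */
typedef struct {
    ToyPageStatus status;
    int page_number;      /* page held by this slot, -1 if empty */
    unsigned lru_count;   /* larger value = used longer ago */
} ToySlot;

/*
 * Pick a slot to reuse: the first empty slot if any, otherwise the slot
 * with the largest lru_count. *victim_busy is set to 1 when the chosen
 * slot is still being read in, which is the case where the caller would
 * have to wait for the read to finish before reusing the slot.
 */
int toy_select_lru_slot(const ToySlot *slots, size_t n_slots, int *victim_busy)
{
    assert(slots != NULL && n_slots > 0 && victim_busy != NULL);

    size_t best = 0;
    *victim_busy = 0;

    for (size_t i = 0; i < n_slots; i++) {
        if (slots[i].status == PAGE_EMPTY)
            return (int) i;          /* a free slot always wins */
        if (slots[i].lru_count > slots[best].lru_count)
            best = i;
    }

    *victim_busy = (slots[best].status == PAGE_READ_IN_PROGRESS);
    return (int) best;
}
```

With only a handful of slots and many pages in demand, the "least recently used" slot can easily be one whose read was started a moment ago, and victim_busy comes back as 1. In this sketch the caller merely learns that it must wait; the real code path does the waiting by calling SimpleLruReadPage, as described above.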

Apparently, Jim's setup could desperately do with a larger SLRU arena
for pg_subtrans, because this is supposed to be a never-happen path ---
if you can't finish loading a page before you need its slot for
something else, you are thrashing with a capital T.

I suppose there's a bug in this path, but I'm darned if I can see what
it is. There are a number of obvious inefficiencies, but those
shouldn't be important given that this isn't supposed to happen much.
But how's it getting to the Assert failure?

regards, tom lane
