
Re: BUG #3245: PANIC: failed to re-find shared lock object

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: "Dorochevsky, Michel" <michel(dot)dorochevsky(at)softcon(dot)de>, pgsql-bugs(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BUG #3245: PANIC: failed to re-find shared lock object
Date: 2007-04-24 15:06:52
Message-ID: 3393.1177427212@sss.pgh.pa.us
Lists: pgsql-bugs, pgsql-hackers, pgsql-patches

Heikki Linnakangas <heikki(at)enterprisedb(dot)com> writes:
> I briefly went through all callers of hash_seq_init. The only place 
> where we explicitly rely on being able to add entries to a hash table 
> while scanning it is in tbm_lossify. There's more complex loops in 
> portalmem.c and relcache.c, which I think are safe, but would need to 
> look closer. There's also the pg_prepared_statement 
> set-returning-function that keeps a scan open across calls, which seems 
> error-prone.

The pending-fsync stuff in md.c is also expecting to be able to add
entries during a scan.

I don't think we can go in the direction of forbidding insertions during
a scan --- as the case at hand shows, it's just not always obvious that
that could happen, and finding/fixing such a problem is nigh impossible.
(We were darn fortunate to be able to reproduce this one.)  Plus we have
a couple of places where it's really necessary to be able to do it,
anyway.

The only answer I can see that seems reasonably robust is to change
dynahash.c so that it tracks whether any seq_search scans are open on a
hashtable, and doesn't carry out any splits while one is.  This wouldn't
cost anything noticeable in performance, assuming that not very many
splits are postponed.  The PITA aspect of it is that we'd need to add
bookkeeping mechanisms to ensure that the count of active scans gets
cleaned up on error exit.  It's not like we've not got lots of those,
though.
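
To make the proposal concrete, here is a minimal sketch of the split-deferral idea on a toy table. It is not dynahash.c code: the ToyHash struct and the toy_* functions are hypothetical names invented for illustration, and the table stores no real entries, only the counters that decide when a split happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy table: just the counters the proposal needs. */
typedef struct ToyHash
{
    long nentries;    /* entries currently stored */
    long nbuckets;    /* buckets currently allocated */
    long ffactor;     /* target fill factor (entries per bucket) */
    int  open_scans;  /* sequential scans currently open */
    long nsplits;     /* splits actually carried out */
} ToyHash;

void toy_init(ToyHash *h, long nbuckets, long ffactor)
{
    h->nentries = 0;
    h->nbuckets = nbuckets;
    h->ffactor = ffactor;
    h->open_scans = 0;
    h->nsplits = 0;
}

/* Opening a scan bumps the count; closing it drops the count again. */
void toy_scan_begin(ToyHash *h)
{
    h->open_scans++;
}

void toy_scan_end(ToyHash *h)
{
    assert(h->open_scans > 0);
    h->open_scans--;
}

/* Insert one entry; returns true if a bucket split was performed. */
bool toy_insert(ToyHash *h)
{
    bool split = false;

    /* Split only when over the fill factor AND no scan is open. */
    if (h->open_scans == 0 && h->nentries / h->nbuckets >= h->ffactor)
    {
        h->nbuckets++;      /* one bucket added per split */
        h->nsplits++;
        split = true;
    }
    h->nentries++;
    return split;
}
```

Insertions made while a scan is open still succeed; they just leave the table over its fill factor until the scan closes, at which point the next insertion resumes splitting.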

Possibly we could simplify matters a bit by not worrying about cleaning
up leaked counts at subtransaction abort, ie, the list of open scans
would only get forced to empty at top transaction end.  This carries a
slightly higher risk of meaningful performance degradation, but in
practice I doubt it's a big problem.  If we agreed to that, then we'd not
need ResourceOwner support --- it could be handled like LWLock counts.
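
The simplified cleanup could look roughly like the sketch below: a fixed-size list of tables with open scans, which is simply forced to empty at top transaction end. All names here (open_scan_tables, register_scan, deregister_scan, has_open_scan, at_top_xact_end, MAX_OPEN_SCANS) are assumptions made up for this example, not existing PostgreSQL functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OPEN_SCANS 100      /* arbitrary limit for the sketch */

/* One slot per open scan; a table may appear more than once. */
static const void *open_scan_tables[MAX_OPEN_SCANS];
static int num_open_scans = 0;

/* Record that a scan was opened on 'table'; false if the list is full. */
bool register_scan(const void *table)
{
    if (num_open_scans >= MAX_OPEN_SCANS)
        return false;
    open_scan_tables[num_open_scans++] = table;
    return true;
}

/* Forget one scan on 'table' (normal, non-error scan termination). */
void deregister_scan(const void *table)
{
    for (int i = num_open_scans - 1; i >= 0; i--)
    {
        if (open_scan_tables[i] == table)
        {
            /* Fill the hole with the last slot and shrink the list. */
            open_scan_tables[i] = open_scan_tables[num_open_scans - 1];
            num_open_scans--;
            return;
        }
    }
}

/* Would be consulted before splitting: is any scan open on 'table'? */
bool has_open_scan(const void *table)
{
    for (int i = 0; i < num_open_scans; i++)
    {
        if (open_scan_tables[i] == table)
            return true;
    }
    return false;
}

/*
 * Called at top transaction end only: force the list to empty, so scans
 * leaked by an error exit stop blocking splits.  Returns the number of
 * leaked entries that were discarded.
 */
int at_top_xact_end(void)
{
    int leaked = num_open_scans;

    num_open_scans = 0;
    return leaked;
}
```

A scan leaked by a subtransaction abort keeps postponing splits on its table until the top transaction ends, which is the performance risk mentioned above.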

pg_prepared_statement is simply broken --- what if the next-to-scan
statement is deleted between calls?  It'll have to be changed.

Comments?

			regards, tom lane
