
Re: BUG #3245: PANIC: failed to re-find shared lock object

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Dorochevsky, Michel" <michel(dot)dorochevsky(at)softcon(dot)de>, pgsql-bugs(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BUG #3245: PANIC: failed to re-find shared lock object
Date: 2007-04-24 12:50:17
Message-ID: 462DFD09.8000508@enterprisedb.com
Lists: pgsql-bugs, pgsql-hackers, pgsql-patches

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Also, we have a generic issue that making fresh entries in a hashtable
>> might result in a concurrent hash_seq_search scan visiting existing
>> entries more than once; that's definitely not something any of the
>> existing callers are thinking about.
> 
> Ouch. Note that we can also miss some entries altogether, which is 
> probably even worse.

In case someone is wondering how that can happen, here's an example. 
We're scanning a bucket that contains four entries, and we split it 
after returning 1:

1 -> 2* -> 3 -> 4

* denotes the next entry the seq scan has stored.

If the split goes, for example, like this:

1 -> 3
2* -> 4

The seq scan will continue scanning from 2, then 4, and miss 3 altogether.
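
The example above can be reproduced with a small simulation. This is only 
a sketch in Python, not dynahash code: Entry, build_chain, split_chain and 
scan_with_split are hypothetical names, and the rule "even keys move to the 
new bucket" is an assumption chosen to match the split shown above.

```python
class Entry:
    """One element of a bucket's singly linked chain."""
    def __init__(self, key):
        self.key = key
        self.next = None


def build_chain(keys):
    """Build a chain in the given order and return its head."""
    head = None
    for key in reversed(keys):
        entry = Entry(key)
        entry.next = head
        head = entry
    return head


def split_chain(head, moves_to_new):
    """Relink a chain into (old_head, new_head), preserving relative order."""
    heads = {False: None, True: None}
    tails = {False: None, True: None}
    entry = head
    while entry is not None:
        following = entry.next
        entry.next = None
        side = moves_to_new(entry.key)
        if tails[side] is None:
            heads[side] = entry
        else:
            tails[side].next = entry
        tails[side] = entry
        entry = following
    return heads[False], heads[True]


def scan_with_split():
    """Return the keys a scan of this one chain visits when the bucket is
    split right after the first entry has been returned."""
    head = build_chain([1, 2, 3, 4])
    visited = [head.key]        # the scan returns 1 ...
    saved_next = head.next      # ... and remembers 2* as the next entry

    # Assumed split rule: even keys move, giving 1 -> 3 and 2* -> 4.
    split_chain(head, lambda key: key % 2 == 0)

    # The scan resumes from the pointer it saved before the split.
    entry = saved_next
    while entry is not None:
        visited.append(entry.key)
        entry = entry.next
    return visited


if __name__ == "__main__":
    print(scan_with_split())
```

The saved pointer now leads into the new chain, so the walk never reaches 
the entry that stayed behind in the old one.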

I briefly went through all callers of hash_seq_init. The only place 
where we explicitly rely on being able to add entries to a hash table 
while scanning it is in tbm_lossify. There are more complex loops in 
portalmem.c and relcache.c, which I think are safe, but I would need to 
look closer. There's also the pg_prepared_statement 
set-returning-function that keeps a scan open across calls, which seems 
error-prone.

Should we document the fact that it's not safe to insert new entries into 
a hash table while scanning it, and fix the few call sites that do that, 
or does anyone see a better solution? One alternative would be to 
inhibit bucket splits while a scan is in progress, but then we'd need to 
take care to clean up after each scan.
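
To illustrate the second alternative, here is a minimal sketch of a table 
that counts open scans and postpones splits until the last one has ended. 
It is written in Python for brevity and is not a proposal for the dynahash 
API: GuardedHashTable, active_scans and max_fill are hypothetical names, and 
doubling the bucket array stands in for a real bucket split. The context 
manager is one way to get the cleanup after each scan, including when the 
scan is abandoned because of an error.

```python
from contextlib import contextmanager


class GuardedHashTable:
    """Toy chained hash table that defers growth while scans are open."""

    def __init__(self, max_fill=2):
        self.buckets = [[]]
        self.max_fill = max_fill    # average entries per bucket before growing
        self.size = 0
        self.active_scans = 0       # number of scans currently in progress

    def insert(self, key):
        self.buckets[hash(key) % len(self.buckets)].append(key)
        self.size += 1
        self._maybe_split()

    def _maybe_split(self):
        if self.active_scans > 0:
            return                  # inhibited: a scan is in progress
        while self.size > self.max_fill * len(self.buckets):
            keys = [key for bucket in self.buckets for key in bucket]
            self.buckets = [[] for _ in range(2 * len(self.buckets))]
            for key in keys:
                self.buckets[hash(key) % len(self.buckets)].append(key)

    @contextmanager
    def scan(self):
        self.active_scans += 1
        try:
            yield self._iterate()
        finally:
            # Runs on normal exit and on error alike.
            self.active_scans -= 1
            self._maybe_split()     # catch up on any deferred growth

    def _iterate(self):
        # Inserts during a scan only append to existing buckets, so entries
        # that existed when the scan started are each visited exactly once.
        for bucket in self.buckets:
            index = 0
            while index < len(bucket):
                yield bucket[index]
                index += 1
```

A caller would write "with table.scan() as entries:" and loop over entries, 
inserting freely inside the loop. The cost is that chains can grow longer 
than intended while a scan stays open, which matters most for scans that are 
kept open across calls.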

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
