Re: BUG #3245: PANIC: failed to re-find shared lock object

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Dorochevsky, Michel" <michel(dot)dorochevsky(at)softcon(dot)de>, pgsql-bugs(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BUG #3245: PANIC: failed to re-find shared lock object
Date: 2007-04-24 12:50:17
Message-ID: 462DFD09.8000508@enterprisedb.com
Lists: pgsql-bugs pgsql-hackers pgsql-patches

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Also, we have a generic issue that making fresh entries in a hashtable
>> might result in a concurrent hash_seq_search scan visiting existing
>> entries more than once; that's definitely not something any of the
>> existing callers are thinking about.
>
> Ouch. Note that we can also miss some entries altogether, which is
> probably even worse.

In case someone is wondering how that can happen, here's an example.
We're scanning a bucket that contains four entries, and we split it
after returning 1:

1 -> 2* -> 3 -> 4

* denotes the next entry the seq scan has stored.

If the split goes, for example, like this:

1 -> 3
2* -> 4

The seq scan will continue scanning from 2, then 4, and miss 3 altogether.
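
The scenario above can be reproduced with a toy program. This is only a
sketch under simplifying assumptions: the Entry struct and the
scan_with_split function are hypothetical and are not dynahash's own
data structures or API. The scan keeps a pointer to the next entry to
return, the chain is relinked underneath it, and the scan then resumes
from the saved pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy chain entry; NOT dynahash's HASHELEMENT, just an illustration. */
typedef struct Entry
{
    int           key;
    struct Entry *next;
} Entry;

/*
 * Walk a four-entry chain like a seq scan that remembers the *next*
 * entry to return. After entry 1 has been returned, the bucket is
 * split so that 1 and 3 stay in the old chain and 2 and 4 move to a
 * new one. Visited keys are written to 'visited'; the return value is
 * the number of entries the scan saw.
 */
int
scan_with_split(int *visited, int max_visited)
{
    Entry  e[4];
    Entry *cur;
    Entry *saved_next;
    int    n = 0;
    int    i;

    assert(visited != NULL && max_visited >= 4);

    /* Build the original chain: 1 -> 2 -> 3 -> 4 */
    for (i = 0; i < 4; i++)
    {
        e[i].key = i + 1;
        e[i].next = (i < 3) ? &e[i + 1] : NULL;
    }

    /* Return entry 1; the scan stores entry 2 as the next one (2*). */
    cur = &e[0];
    saved_next = cur->next;
    visited[n++] = cur->key;

    /* Bucket split: old bucket becomes 1 -> 3, new bucket 2 -> 4. */
    e[0].next = &e[2];
    e[2].next = NULL;
    e[1].next = &e[3];
    e[3].next = NULL;

    /* The scan resumes from the saved pointer and follows its chain. */
    for (cur = saved_next; cur != NULL && n < max_visited; cur = cur->next)
        visited[n++] = cur->key;

    return n;
}
```

Calling scan_with_split with a four-element array fills in the keys
1, 2, 4 and returns 3: entry 3 is never reached, because it now hangs
off a chain the scan has already left.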

I briefly went through all callers of hash_seq_init. The only place
where we explicitly rely on being able to add entries to a hash table
while scanning it is in tbm_lossify. There are more complex loops in
portalmem.c and relcache.c, which I think are safe, but I would need to
look closer. There's also the pg_prepared_statement
set-returning-function that keeps a scan open across calls, which seems
error-prone.

Should we document the fact that it's not safe to insert new entries into
a hash table while scanning it, and fix the few call sites that do that,
or does anyone see a better solution? One alternative would be to
inhibit bucket splits while a scan is in progress, but then we'd need to
take care to clean up after each scan.
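
A rough sketch of the second alternative, assuming a per-table counter
of open scans. Everything here (ToyTable, toy_scan_begin, toy_scan_end,
toy_insert, the fill-factor rule) is hypothetical and only illustrates
the idea; it is not how dynahash is written. It also shows where the
cleanup burden comes from: every scan, including one abandoned early,
has to call toy_scan_end, or splits stay inhibited.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy table; field names are illustrative only. */
typedef struct ToyTable
{
    int nentries;       /* entries currently stored */
    int nbuckets;       /* current number of buckets */
    int active_scans;   /* scans started but not yet ended */
} ToyTable;

/* Register a scan; bucket splits are inhibited while any scan is open. */
void
toy_scan_begin(ToyTable *t)
{
    t->active_scans++;
}

/* Must be called for every scan, even one that stops before the end. */
void
toy_scan_end(ToyTable *t)
{
    assert(t->active_scans > 0);
    t->active_scans--;
}

/*
 * Account for one new entry. The table would normally split a bucket
 * once entries outnumber buckets, but skips the split while a scan is
 * in progress. Returns true if a split was performed.
 */
bool
toy_insert(ToyTable *t)
{
    t->nentries++;

    if (t->nentries > t->nbuckets && t->active_scans == 0)
    {
        t->nbuckets++;      /* stand-in for the real split work */
        return true;
    }
    return false;
}
```

With an open scan, toy_insert adds the entry but leaves nbuckets
unchanged; the deferred split happens on the first insert after the
last scan has ended.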

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
