Race conditions in relcache load (again)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Race conditions in relcache load (again)
Date: 2008-04-14 14:25:54
Message-ID: 2010.1208183154@sss.pgh.pa.us
Lists: pgsql-hackers

A while back we did some significant rejiggering to ensure that no
relcache load would be attempted without holding at least
AccessShareLock on the relation. (Otherwise, if someone else
is in the process of updating one of the system catalog
rows defining the relation, there's a race condition for SnapshotNow
scans: the new row version might not yet be committed when you scan it,
and if you come to the old row version second, it could be committed
dead by the time you scan it, and then you don't see the row at all.)
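
To make the race concrete, here is a toy Python model of it. This is not PostgreSQL code: `RowVersion`, `visible_now`, `scan`, and the transaction IDs are all made up for illustration, and the visibility rule is a deliberately simplified stand-in for SnapshotNow (a version is visible if its inserter has committed and its deleter has not, judged at the instant the tuple is examined).

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class RowVersion:
    name: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that superseded it, if any


def visible_now(row: RowVersion, committed: Set[int]) -> bool:
    """Simplified SnapshotNow-style test, evaluated at the moment of the check."""
    if row.xmin not in committed:
        return False
    return row.xmax is None or row.xmax not in committed


def scan(heap: List[RowVersion], committed: Set[int],
         commit_after_first: Optional[int] = None) -> List[str]:
    """Scan versions in physical order; optionally let a transaction
    commit right after the first version has been examined."""
    seen = []
    for i, row in enumerate(heap):
        if visible_now(row, committed):
            seen.append(row.name)
        if i == 0 and commit_after_first is not None:
            committed.add(commit_after_first)  # the updater commits mid-scan
    return seen


def make_heap() -> List[RowVersion]:
    # Hypothetical layout: the scan reaches the new version first.
    return [RowVersion("new", xmin=200),
            RowVersion("old", xmin=100, xmax=200)]


# Updater (xid 200) still in progress for the whole scan: old version is seen.
before = scan(make_heap(), {100})
# Updater committed before the scan started: new version is seen.
after = scan(make_heap(), {100, 200})
# Updater commits between the two checks: neither version is seen.
raced = scan(make_heap(), {100}, commit_after_first=200)
```

In the raced case the new version is rejected because its inserter has not yet committed, and the old version is rejected because its deleter has, so the scan comes back empty.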

While thinking about Darren Reed's repeat trouble report
http://archives.postgresql.org/pgsql-admin/2008-04/msg00113.php
I realized that we failed to plug all the gaps of this type,
because relcache.c contains *internal* cache load/reload operations
that aren't protected. In particular, the LOAD_CRIT_INDEX macro
calls invoke relcache load on indexes that aren't locked, so they'd
be at risk from a concurrent REINDEX or similar on those system
indexes. RelationReloadIndexInfo seems at risk as well.
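
A rough sketch of why taking a lock first closes the window, again as a toy Python model and not the real lock manager. `LockTable`, `load_index_entry`, and `reindex_attempt` are hypothetical names. The model assumes the REINDEX-like writer needs a lock that conflicts with a shared lock, and it fails immediately on conflict where the real lock manager would block.

```python
class LockTable:
    """Toy lock table: many shared holders or one exclusive holder per relation."""

    def __init__(self):
        self.shared = {}        # relation name -> number of shared holders
        self.exclusive = set()  # relations currently held exclusively

    def try_shared(self, rel):
        if rel in self.exclusive:
            return False
        self.shared[rel] = self.shared.get(rel, 0) + 1
        return True

    def release_shared(self, rel):
        self.shared[rel] -= 1

    def try_exclusive(self, rel):
        if rel in self.exclusive or self.shared.get(rel, 0) > 0:
            return False
        self.exclusive.add(rel)
        return True

    def release_exclusive(self, rel):
        self.exclusive.discard(rel)


def reindex_attempt(locks, rel):
    """Stand-in for a concurrent writer; returns True if it got in."""
    got = locks.try_exclusive(rel)
    if got:
        locks.release_exclusive(rel)
    return got


def load_index_entry(locks, rel, take_lock):
    """Simulate a cache load; report whether a writer could run mid-load."""
    if take_lock and not locks.try_shared(rel):
        raise RuntimeError("could not get shared lock on " + rel)
    try:
        # Midpoint of the catalog scan: the writer tries to start here.
        return reindex_attempt(locks, rel)
    finally:
        if take_lock:
            locks.release_shared(rel)


# Index name used purely as an illustrative label.
unlocked = load_index_entry(LockTable(), "pg_class_oid_index", take_lock=False)
locked = load_index_entry(LockTable(), "pg_class_oid_index", take_lock=True)
```

Without the shared lock the writer can slip in during the load; with it, the writer is kept out until the load has finished.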

AFAICS this doesn't explain Darren's problem because it would only
be a transient failure at the instant of committing the REINDEX;
and whatever he's being burnt by has persistent effects. Nonetheless
it sure looks like a bug. Anyone think it isn't necessary to lock
the target relation here?

regards, tom lane
