Re: 7.2.3 vacuum bug

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rod Taylor <rbt(at)rbt(dot)ca>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 7.2.3 vacuum bug
Date: 2002-11-02 03:50:32
Message-ID: 200211020350.gA23oW910502@candle.pha.pa.us
Lists: pgsql-hackers


Is this a TODO?

---------------------------------------------------------------------------

Tom Lane wrote:
> Rod Taylor <rbt(at)rbt(dot)ca> writes:
> > ERROR: RelationClearRelation: relation 11584078 deleted while still in
> > use
>
> > I've been unable to come up with a test case that will cause the
> > problem; it seems to be timing related. The queries that are currently
> > running when these errors occur do a lot of work with temp tables that
> > are frequently truncated.
>
> Hm. vacuum.c tries to avoid this class of problem:
>
>     /*
>      * Race condition -- if the pg_class tuple has gone away since the
>      * last time we saw it, we don't need to vacuum it.
>      */
>     if (!SearchSysCacheExists(RELOID,
>                               ObjectIdGetDatum(relid),
>                               0, 0, 0))
>     {
>         CommitTransactionCommand(true);
>         return true;    /* okay 'cause no data there */
>     }
>
>     ...
>
>     onerel = relation_open(relid, lmode);
>
> but on reflection it's clear that this doesn't really prevent a race
> condition. If the table is already exclusive-locked by a DROP TABLE
> that hasn't committed yet (eg, the implicit DROP that happens when temp
> tables are cleared out at backend exit), then the syscache lookup will
> go fine, but the relation_open() routine blocks waiting for lock and
> eventually fails.
>
> What would probably work better is to first lock the relation OID,
> then see if we can open the relation or not.
>
> Thinking further, it's really kinda bogus that LockRelation() works on
> an already-opened Relation; if possible we should acquire the lock
> before attempting to create a relcache entry. (We only need to know the
> OID and the relisshared status before we can make a locktag, so it'd be
> possible to acquire the lock using only the contents of the pg_class row.)
> Not sure how much code restructuring might be involved to make this
> happen, but it'd be worth thinking about for 7.4.
>
> regards, tom lane
>
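
To make the ordering problem above concrete, here is a minimal sketch of
the two orderings. It is a single-threaded toy model, not PostgreSQL code:
every name in it (ToyRel, ToyLockTag, toy_make_locktag, toy_lock_by_tag,
vacuum_check_then_lock, vacuum_lock_then_check) is hypothetical, and
"waiting for the lock" is simulated by letting the pending DROP commit.
The rule that a shared relation gets an invalid database OID in its lock
tag is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;

#define TOY_INVALID_OID ((Oid) 0)   /* hypothetical "no database" marker */
#define TOY_MY_DB       ((Oid) 1)   /* hypothetical current database OID */

/* Hypothetical lock tag: built from the OID and the shared flag alone. */
typedef struct ToyLockTag
{
    Oid rel_id;
    Oid db_id;
} ToyLockTag;

/* Toy stand-in for one pg_class row plus its lock state. */
typedef struct ToyRel
{
    Oid  oid;
    bool is_shared;
    bool exists;        /* row still visible in the catalog */
    bool drop_pending;  /* an uncommitted DROP holds the exclusive lock */
} ToyRel;

typedef enum { VAC_DONE, VAC_SKIPPED, VAC_ERROR } VacResult;

/* Build a lock tag without needing an opened relation (assumed rule). */
ToyLockTag
toy_make_locktag(Oid rel_oid, bool is_shared, Oid my_db)
{
    ToyLockTag tag;

    tag.rel_id = rel_oid;
    tag.db_id = is_shared ? TOY_INVALID_OID : my_db;
    return tag;
}

/*
 * Simulate blocking on the lock: if a DROP is pending, it commits before
 * we are granted the lock, so the catalog row is gone when we return.
 */
static void
toy_lock_by_tag(ToyRel *rel, ToyLockTag tag)
{
    assert(tag.rel_id == rel->oid);
    if (rel->drop_pending)
    {
        rel->exists = false;
        rel->drop_pending = false;
    }
}

/* Order criticized above: existence check first, lock second. */
VacResult
vacuum_check_then_lock(ToyRel *rel)
{
    if (!rel->exists)
        return VAC_SKIPPED;
    toy_lock_by_tag(rel, toy_make_locktag(rel->oid, rel->is_shared, TOY_MY_DB));
    /* The check above is stale by now; the open fails if the row vanished. */
    return rel->exists ? VAC_DONE : VAC_ERROR;
}

/* Proposed order: lock the OID first, then see whether it still exists. */
VacResult
vacuum_lock_then_check(ToyRel *rel)
{
    toy_lock_by_tag(rel, toy_make_locktag(rel->oid, rel->is_shared, TOY_MY_DB));
    if (!rel->exists)
        return VAC_SKIPPED;     /* nothing left to vacuum */
    return VAC_DONE;
}
```

With a relation that has a DROP pending, the first ordering ends in the
error case, while the second skips the relation cleanly; a relation with
no pending DROP is vacuumed normally under either ordering.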

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
