Re: Deadlock in vacuum (check fails)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deadlock in vacuum (check fails)
Date: 2010-01-13 19:27:58
Message-ID: 1582.1263410878@sss.pgh.pa.us
Lists: pgsql-hackers

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> I found the following strange error on gothic moth:

> VACUUM FULL pg_class;
> + ERROR: deadlock detected
> + DETAIL: Process 5913 waits for AccessExclusiveLock on relation 2662
> of database 16384; blocked by process 5915.
> + Process 5915 waits for AccessShareLock on relation 1259 of database
> 16384; blocked by process 5913.
> + HINT: See server log for query details.

The server log shows that 5913 was trying to VACUUM FULL pg_class, and
5915 was in the midst of backend startup. I believe that what happened
is that 5915 was doing a full startup without a relcache init file
(which had likely been deleted by a previous vacuum) and was in the
middle of load_critical_index() for index 2662 = pg_class_oid_index,
so it already held a lock on that index. It would then have needed to
read pg_class (relation 1259). Meanwhile the VACUUM FULL held an
exclusive lock on pg_class and needed to lock its indexes.
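
The cycle reported in the DETAIL lines can be checked mechanically. What follows is only a minimal Python sketch, assuming a hand-built wait-for graph with one blocker per waiter; the name find_cycle is hypothetical, and this is not how the server's deadlock detector is implemented.

```python
# Wait-for edges copied from the DETAIL lines: waiting pid -> blocking pid.
waits_for = {5913: 5915, 5915: 5913}


def find_cycle(graph, start):
    """Follow wait-for edges from start.

    Returns the list of pids on the cycle if the chain leads back to
    start, otherwise None. Assumes each waiter has a single blocker.
    """
    path = [start]
    current = start
    while current in graph:
        current = graph[current]
        if current == start:
            return path
        if current in path:
            return None  # a cycle exists, but it does not pass through start
        path.append(current)
    return None


print(find_cycle(waits_for, 5913))
```

Each process is waiting on a lock the other one holds, so neither can proceed until one of the two requests is cancelled.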

So basically what this boils down to is that load_critical_index is
locking an index before locking its underlying relation, which generally
speaking is against our coding rules. The relcache code has been like
that since 8.2, so I'm a bit surprised that we have not seen this
reported before. It could only happen when the relcache init file is
missing, which isn't the normal state, but it's not exactly unusual
either.

Probably the appropriate fix is to make load_critical_index first get
AccessShare lock on the underlying catalog, not only on the index it's
after. This would slow things down a tad, but since it's not in the
normal startup path I don't think that matters.
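
To illustrate why the lock order matters, here is a toy, single-threaded Python model of the interleaving described above. Everything in it is an assumption made for illustration: ToyLockTable and simulate are hypothetical names, only "share" and "exclusive" modes are modelled, and the real lock manager is far more involved. The OIDs and pids are the ones from the error report.

```python
PG_CLASS = 1259            # relation OID from the error report
PG_CLASS_OID_INDEX = 2662  # index OID from the error report
VACUUM_PID, STARTUP_PID = 5913, 5915


class ToyLockTable:
    """Hypothetical model: a request conflicts if either side is exclusive."""

    def __init__(self):
        self.held = {}       # oid -> list of (pid, mode) currently granted
        self.waits_for = {}  # waiting pid -> set of pids blocking it

    def acquire(self, pid, oid, mode):
        blockers = {p for p, m in self.held.get(oid, [])
                    if p != pid and "exclusive" in (m, mode)}
        if not blockers:
            self.held.setdefault(oid, []).append((pid, mode))
            return "granted"
        self.waits_for[pid] = blockers
        return "deadlock" if self._reaches(pid, pid) else "waiting"

    def _reaches(self, start, target):
        # Depth-first walk over wait-for edges, looking for a path to target.
        stack, seen = list(self.waits_for.get(start, ())), set()
        while stack:
            p = stack.pop()
            if p == target:
                return True
            if p not in seen:
                seen.add(p)
                stack.extend(self.waits_for.get(p, ()))
        return False


def simulate(startup_order):
    """Interleave VACUUM FULL with a backend startup taking two share locks."""
    locks = ToyLockTable()
    steps = [
        (VACUUM_PID, PG_CLASS, "exclusive"),
        (STARTUP_PID, startup_order[0], "share"),
        (VACUUM_PID, PG_CLASS_OID_INDEX, "exclusive"),
        (STARTUP_PID, startup_order[1], "share"),
    ]
    blocked, outcomes = set(), []
    for pid, oid, mode in steps:
        if pid in blocked:
            continue  # a waiting process makes no further requests
        outcome = locks.acquire(pid, oid, mode)
        outcomes.append((pid, oid, outcome))
        if outcome != "granted":
            blocked.add(pid)
    return outcomes


index_first = simulate((PG_CLASS_OID_INDEX, PG_CLASS))    # current behavior
catalog_first = simulate((PG_CLASS, PG_CLASS_OID_INDEX))  # proposed order
print(index_first)
print(catalog_first)
```

In this model the index-first order ends in a wait-for cycle, while the catalog-first order leaves the starting backend simply waiting behind the VACUUM FULL, which can then take its index lock and finish.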

Should we back-patch this? The bug appears to be real back to 8.2,
but if we've not noticed before, I'm not sure how important it is.

regards, tom lane
