Re: [COMMITTERS] pgsql: Fix inadequate locking during get_rel_oids().

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Cc: pgsql-committers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: Fix inadequate locking during get_rel_oids().
Date: 2018-03-06 00:11:52
Message-ID: 20180306001152.qxguen2kclystz7p@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Hi Tom,

On 2017-09-29 20:26:42 +0000, Tom Lane wrote:
> get_rel_oids used to not take any relation locks at all, but that stopped
> being a good idea with commit 3c3bb9933, which inserted a syscache lookup
> into the function. A concurrent DROP TABLE could now produce "cache lookup
> failed", which we don't want to have happen in normal operation. The best
> solution seems to be to transiently take a lock on the relation named by
> the RangeVar (which also makes the result of RangeVarGetRelid a lot less
> spongy). But we shouldn't hold the lock beyond this function, because we
> don't want VACUUM to lock more than one table at a time. (That would not
> be a big problem right now, but it will become one after the pending
> feature patch to allow multiple tables to be named in VACUUM.)

I'm not sure how big a problem this is, but I suspect it is
one. Autovacuum used to skip relations when they're locked, and the
vacuum isn't an anti-wraparound one. But after this change it appears
we'll get stuck behind this new lock, even if VACOPT_NOWAIT is
specified. That's bad because now relations that are AEL locked
(e.g. because of some longrunning DDL) can block global autovacuum
progress.

I noticed this when reviewing a patch that adds NOWAIT (now renamed to
SKIP LOCKED) as a user visible knob and it getting stuck behind an
AEL. The updated version of the patch
http://archives.postgresql.org/message-id/7327B413-1A57-477F-A6A0-6FD80CB5482D%40amazon.com
has an attempt at fixing the issue, although I've not yet looked at it
in any sort of detail.

I suspect we should break out that part once reviewed, and backpatch to
10?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andres Freund 2018-03-06 00:34:09 Re: [COMMITTERS] pgsql: Fix inadequate locking during get_rel_oids().
Previous Message Andres Freund 2018-03-05 23:30:25 pgsql: Fix HEAP_INSERT_IS_SPECULATIVE to HEAP_INSERT_SPECULATIVE in com

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-03-06 00:12:55 Re: Faster inserts with mostly-monotonically increasing values
Previous Message David Steele 2018-03-06 00:00:10 Re: 2018-03 CFM