Shaky coding for vacuuming partitioned relations

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Shaky coding for vacuuming partitioned relations
Date: 2017-09-22 19:13:10
Message-ID: 25023.1506107590@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Somebody inserted this into vacuum.c's get_rel_oids():

tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
if (!HeapTupleIsValid(tuple))
elog(ERROR, "cache lookup failed for relation %u", relid);

apparently without having read the very verbose comment two lines above,
which points out that we're not taking any lock on the target relation.
So, if that relation is concurrently being dropped, you're likely to
get "cache lookup failed for relation NNNN" rather than anything more
user-friendly.

A minimum-change fix would be to replace the elog() with an ereport
that produces the same "relation does not exist" error you'd have
gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
a few microseconds earlier. But that feels like its's band-aiding
around the problem.

What I'm wondering about is changing the RangeVarGetRelid call to take
ShareUpdateExclusiveLock rather than no lock. That would protect the
syscache lookup, and it would also make the find_all_inheritors call
a lot more meaningful.

If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
as soon as we close the caller's transaction, and then we'd acquire
the same or stronger lock inside vacuum_rel(). So that seems fine.
If we're doing an ANALYZE, then the lock would continue to be held
and analyze_rel would merely be acquiring it an extra time, so we'd
actually be removing a race-condition failure scenario for ANALYZE.
This would mean a few more cycles in lock management, but since this
only applies to a manual VACUUM or ANALYZE that specifies a table
name, I'm not too concerned about that.

Thoughts?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-09-22 19:31:12 Re: Re: proposal - psql: possibility to specify sort for describe commands, when size is printed
Previous Message Peter Eisentraut 2017-09-22 19:12:29 Re: Re: proposal - psql: possibility to specify sort for describe commands, when size is printed