Re: Shaky coding for vacuuming partitioned relations

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Shaky coding for vacuuming partitioned relations
Date: 2017-09-25 03:10:50
Message-ID: CAB7nPqQa37OUne_rJzpmWC4EXQaLX9f27-Ma_-rSFL_3mj+HFQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 23, 2017 at 4:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Somebody inserted this into vacuum.c's get_rel_oids():
>
> tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
> if (!HeapTupleIsValid(tuple))
>     elog(ERROR, "cache lookup failed for relation %u", relid);
>
> apparently without having read the very verbose comment two lines above,
> which points out that we're not taking any lock on the target relation.
> So, if that relation is concurrently being dropped, you're likely to
> get "cache lookup failed for relation NNNN" rather than anything more
> user-friendly.

This has been overlooked during the reviews of 3c3bb99, by multiple
people including me. Users should never be able to face an elog(),
and cache lookup failures fall into that category.

> A minimum-change fix would be to replace the elog() with an ereport
> that produces the same "relation does not exist" error you'd have
> gotten from RangeVarGetRelid, had the concurrent DROP TABLE committed
> a few microseconds earlier. But that feels like it's band-aiding
> around the problem.

Yeah, that's not right. At its core this would still be a cache
lookup failure.
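For reference, the minimum-change version would look roughly like
this (untested sketch; "vacrel" stands for the RangeVar available in
get_rel_oids(), so take the exact variable name with a grain of
salt):

    tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
    if (!HeapTupleIsValid(tuple))
        ereport(ERROR,
                (errcode(ERRCODE_UNDEFINED_TABLE),
                 errmsg("relation \"%s\" does not exist",
                        vacrel->relname)));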

> What I'm wondering about is changing the RangeVarGetRelid call to take
> ShareUpdateExclusiveLock rather than no lock. That would protect the
> syscache lookup, and it would also make the find_all_inheritors call
> a lot more meaningful.
>
> If we're doing a VACUUM, the ShareUpdateExclusiveLock would be dropped
> as soon as we close the caller's transaction, and then we'd acquire
> the same or stronger lock inside vacuum_rel(). So that seems fine.
> If we're doing an ANALYZE, then the lock would continue to be held
> and analyze_rel would merely be acquiring it an extra time, so we'd
> actually be removing a race-condition failure scenario for ANALYZE.
> This would mean a few more cycles in lock management, but since this
> only applies to a manual VACUUM or ANALYZE that specifies a table
> name, I'm not too concerned about that.

I am +1 on that. Testing this approach, I am not seeing anything
wrong either. The call to find_all_inheritors should also use
ShareUpdateExclusiveLock, and the lock passed to RangeVarGetRelid()
needs to be reworked to match.

Attached is a proposed patch.
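
In rough terms, the lookup side of it becomes something like the
following (a simplified sketch rather than the patch itself; the
vacrel, include_parts and oid_list names mirror what get_rel_oids()
uses, but treat them as illustrative):

    /*
     * Take ShareUpdateExclusiveLock at lookup time so that the
     * relation cannot be dropped concurrently before vacuum_rel()
     * or analyze_rel() acquires its own lock on it.
     */
    relid = RangeVarGetRelid(vacrel, ShareUpdateExclusiveLock, false);

    /* Expand the partition tree while holding the same lock level. */
    if (include_parts)
        oid_list = list_concat(oid_list,
                               find_all_inheritors(relid,
                                                   ShareUpdateExclusiveLock,
                                                   NULL));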

> Thoughts?

Before I forget... Another issue currently on HEAD and REL_10_STABLE
is that the OIDs of the individual partitions are used for the actual
work, but the RangeVar of the parent is used for error reports. This
leads to incorrect reports if a partition goes away in the middle of
autovacuum, as only information about the parent is given to the user
(see the rough sketch below). I am not saying that this needs to be
fixed in REL_10_STABLE at this stage though, as this would require
refactoring similar to what the patch adding support for VACUUM with
multiple relations does. But I digress.
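
Here is that rough sketch of the current processing loop (simplified;
the variable names approximate what vacuum() uses):

    foreach(cur, relations)
    {
        Oid        relid = lfirst_oid(cur);

        /*
         * "relation" is still the RangeVar of the parent here, so an
         * error raised while processing a dropped partition reports
         * the parent's name instead of the partition's.
         */
        vacuum_rel(relid, relation, options, params);
    }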
--
Michael

Attachment Content-Type Size
vacuum-partition-locks.patch application/octet-stream 1.7 KB
