Re: Improve behavior of concurrent ANALYZE/VACUUM

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: "horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Improve behavior of concurrent ANALYZE/VACUUM
Date: 2018-08-20 20:57:00
Message-ID: 720DF5E4-053F-46C7-B11F-550EF2C39F6D@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Sorry for the delay! I looked through the latest patch.

On 8/17/18, 1:43 AM, "Michael Paquier" <michael(at)paquier(dot)xyz> wrote:
> I have reworked the patch on this side, clarifying the use of the new
> common API for the logs. One thing I am wondering about is what do we
> want to do when VACUUM ANALYZE is used. As of HEAD, if vacuum_rel()
> stops, then analyze_rel() is never called, and the only log showing up
> to a non-owner user would be:
> skipping "foo" --- only superuser can vacuum it
>
> With this patch, things become perhaps more confusing by generating two
> WARNING log entries:
> skipping "foo" --- only superuser can vacuum it
> skipping "foo" --- only superuser can analyze it
>
> We could either combine both in a single message, or just generate the
> message for vacuum as HEAD does now. I have also added some simple
> regression tests triggering the skipping logs for shared catalogs,
> non-shared catalogs and non-owners. This could be a separate patch as
> well.

I like the idea of emitting a separate WARNING for each for clarity on
what operations are being skipped. However, I think it could be a bit
confusing with the current wording. Perhaps something like "skipping
vacuum of..." and "skipping analyze of..." would make things clearer.
Another thing to keep in mind is how often only one of these messages
will apply. IIUC the vast majority of VACUUM (ANALYZE) statements
that need to emit such log statements would emit both. Plus, while
VACUUM (ANALYZE) is clearly documented as performing both operations,
I can easily see the argument that users may view it as one big
command and that emitting multiple log entries could be a confusing
change in behavior.

In short, my vote would be to maintain the current behavior for now
and to bring up any logging improvements separately.

+ /*
+ * Depending on the permission checks done while building the list
+ * of relations to work on, it could be possible that the list is
+ * empty, hence do nothing in this case.
+ */
+ if (relations == NIL)
+ return;

It might be better to let the command go through normally so that we
don't miss any cleanup at the end (e.g. deleting vac_context).

+/*
+ * Check if a given relation can be safely vacuumed or not. If the
+ * user is not the relation owner, issue a WARNING log message and return
+ * false to let the caller decide what to do with this relation.
+ */
+bool
+vacuum_is_relation_owner(Oid relid, Form_pg_class reltuple, int options)
+{
+ Assert((options & (VACOPT_VACUUM | VACOPT_ANALYZE)) != 0);
+
+ /*
+ * Check permissions.
+ *
+ * We allow the user to vacuum a table if he is superuser, the table
+ * owner, or the database owner (but in the latter case, only if it's not
+ * a shared relation). pg_class_ownercheck includes the superuser case.
+ *
+ * Note we choose to treat permissions failure as a WARNING and keep
+ * trying to vacuum the rest of the DB --- is this appropriate?
+ */

Do you think it's worth adding ANALYZE to these comments as well?

+ if (!(pg_class_ownercheck(relid, GetUserId()) ||
+ (pg_database_ownercheck(MyDatabaseId, GetUserId()) && !reltuple->relisshared)))

Returning true right away when the role does have permissions might be
a nice way to save a level of indentation.

+ /*
+ * To check whether the relation is a partitioned table and its
+ * ownership, fetch its syscache entry.
+ */
+ tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
+ if (!HeapTupleIsValid(tuple))
+ elog(ERROR, "cache lookup failed for relation %u", relid);
+ classForm = (Form_pg_class) GETSTRUCT(tuple);
+
+ /* check permissions of relation */
+ if (!vacuum_is_relation_owner(relid, classForm, options))
+ {
+ ReleaseSysCache(tuple);
+
+ /*
+ * Release lock again with AccessShareLock -- see below for
+ * the reason why this lock is released.
+ */
+ UnlockRelationOid(relid, AccessShareLock);
+ return vacrels;
+ }

I think this actually changes the behavior for partitioned tables.
Presently, we still go through and collect all the partitions in the
vacrels list. With this change, we will skip collecting a table's
partitions if the current role doesn't have the required permissions.
Perhaps we should skip adding the current relation to vacrels if
vacuum_is_relation_owner() returns false, and then we could go through
the partitions and check permissions on those as well. Since we don't
take any locks on the individual partitions yet, getting the relation
name and calling pg_class_ownercheck() safely might be tricky, though.

Nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-08-20 23:38:34 Re: Temporary tables prevent autovacuum, leading to XID wraparound
Previous Message Alvaro Herrera 2018-08-20 20:21:22 Re: ATTACH/DETACH PARTITION CONCURRENTLY