Re: Add SKIP LOCKED to VACUUM and ANALYZE

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Andres Freund" <andres(at)anarazel(dot)de>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Lyes Ameddah <lyes(dot)amd(at)gmail(dot)com>
Subject: Re: Add SKIP LOCKED to VACUUM and ANALYZE
Date: 2018-07-18 16:58:48
Message-ID: E6C879D1-F33A-4875-903D-5C21E2CCE664@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Michael,

Thanks for taking a look.

On 7/17/18, 1:22 AM, "Michael Paquier" <michael(at)paquier(dot)xyz> wrote:
> The first thing which is striking me is that we may actually *not* want
> to check for lock skipping within expand_vacuum_rel() as that's mainly a
> function aimed at building the relations which are going to be vacuumed,
> and that all the lock skipping operations should happen at the beginning
> of vacuum_rel(). This way, even if the parent of a partition tree is
> listed but cannot have its RangeVar opened immediately, then we have a
> chance to have some of its partitions to be vacuumed after listing them.
> This point is debatable as there are pros and cons. I would be of the
> opinion to not change expand_vacuum_rel()'s mission to build the list of
> relations to VACUUM, and actually complain about a lock not available
> when the relation is ready to be processed individually. It is also
> perfectly possible to skip locks for partitions by listing them
> individually in the VACUUM command used. I am pretty sure that Andres
> would complain at the sight of this paragraph as this is backwards with
> the reason behind the refactoring of RangeVarGetRelidExtended but I like
> the new API better :p

I see your point. The only time that expand_vacuum_rel() will block
is due to a conflicting AccessExclusiveLock on the relation, and the
reason we take a lock on the relation temporarily in
expand_vacuum_rel() is documented as follows:

/*
* We transiently take AccessShareLock to protect the syscache lookup
* below, as well as find_all_inheritors's expectation that the caller
* holds some lock on the starting relation.
*/

> For this part, it seems to me that we can do better than what is in
> HEAD:
> - Call RangeVarGetRelidExtended without lock.
> - Check for has_subclass(), which should be extended so as it does not
> complain because of a missing relation so as vacuum.c does the error
> handling.
> - Call RangeVarGetRelidExtended a second time if a lookup with
> find_all_inheritors is necessary. If the relation goes missing
> in-between then we just get an error as the current behavior.

Perhaps we could extend RangeVarGetRelidExtended() to only lock if
has_subclass() is true. However, I also understand Robert's position
on calling RangeVarGetRelidExtended() with NoLock, and I'm not sure if
this locking optimization is worth the complexity.

> I am also questioning the fact of complicating vac_open_indexes() by
> skipping a full relation vacuum if one of its indexes cannot be opened,
> which would be the case for an ALTER INDEX for example. It seems to me
> that skipping locks should just happen for the parent relation, so I
> would not bother much about this case, and just document the behavior.
> If somebody argues for this option, we could always have a SKIP_INDEXES
> or such.

As long as we document this behavior, that seems reasonable to me.

> "SKIP LOCKED" should be named "SKIP_LOCKED", with an underscore for
> consistency with the other options.

I chose SKIP LOCKED with a space because both words are already
reserved and SELECT uses it this way. I'm fine with adding an
underscore for consistency with DISABLE_PAGE_SKIPPING, though.

> -void
> -cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose)
> +bool
> +cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose,
> -bool skip_locked)
> {
> Some refactoring of CLUSTER on the way here? It would be nice to avoid
> having three boolean arguments here, and opens the door for an extended
> CLUSTER grammar with parenthesis as well as a SKIP_LOCKED option for
> it. And this could be refactored first. VACUUM FULL being a variant of
> CLUSTER, we could just reuse the same options... Now using separate
> option names could be cleaner.

Refactoring the boolean arguments of cluster_rel() into an options
argument seems like a good idea.

> The stuff of get_elevel_for_skipped_relation could be refactored into
> something used as well by cluster_rel as the same logic is used in three
> places (vacuum_rel, analyze_rel and cluster_rel with try_relation_open).

This seems like a good idea if we intend to add SKIP LOCKED to CLUSTER
eventually, and it would be nice to cut down on some of the duplicated
ereport() calls. I'll look into it.

Nathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-07-18 16:59:30 Re: [HACKERS] logical decoding of two-phase transactions
Previous Message David G. Johnston 2018-07-18 15:41:26 Re: One transaction and several processes